An efficient memristive alternating crossbar array and the design of full adder

The memristor is one of the most promising emerging technologies for overcoming the von Neumann bottleneck, owing to its non-volatile and binary characteristics. This paper studies design methods for high-efficiency logic circuits based on memristors. First, a multiple-input-multiple-output (MIMO) logic circuit design scheme based on IMPLY and AND logic is proposed, from which multiple new efficient logic operations can be derived and which completes complex logic with fewer steps and memristors. Second, to perform rapid interactive operations between different rows, an alternating crossbar array structure is designed that can quickly complete cross-row logic operations. Finally, a highly efficient full adder (FA) based on MIMO logic and the alternating crossbar array is proposed. To accomplish a 32-bit addition, the proposed FA needs 160 memristors and only 41 steps. Compared with state-of-the-art FAs, our design needs fewer steps than all of them and fewer memristors than the parallel ones.


Introduction
The memristor was first proposed by Leon Chua in 1971 [1,2] and realized in physical form in 2008. Its most outstanding characteristics are nanoscale dimensions, non-volatility, low power consumption and multi-state programmability. Thus, early application research on the memristor focused on nonvolatile memory [3][4][5]. As research developed, applications were extended to neuromorphic networks [6,7] and chaotic circuit design [8][9][10][11][12]. In logic circuit design in particular, the memristor can realize the processing-in-memory (PIM) architecture and is considered one of the most promising candidate technologies for breaking through the von Neumann bottleneck [13][14][15][16].
According to their components, memristor-based logic circuits can be divided into two categories. One is hybrid CMOS-memristor logic, which mixes CMOS logic and memristive logic, such as memristor ratioed logic (MRL) [17] and memristor-based threshold logic gates (TLGs) [18]. The other is memristor-only logic, which includes stateful logic and sequential logic. The characteristic of sequential logic [19] is that input and output can be represented by different variables, whereas the logic input and output of stateful logic [20] are both represented by resistance, as in Memristor Aided Logic (MAGIC) [21] and Material Implication (IMPLY) [22,23], both of which are built on crossbar array structures. IMPLY is simple and reliable and can implement complete Boolean logic operations [24], and is therefore more widely applied [25][26][27].
Figure 1 shows the I-V characteristic curve of the ideal memristor model. $V_{CLOSE}$ and $V_{OPEN}$ represent the positive and negative threshold voltages of the memristor, respectively. $V_{SET}$ is slightly larger than the positive threshold voltage; when acting alone, it switches the memristor to the low-resistance state (LRS) $R_{ON}$, representing logic 1. $V_{CLEAR}$ is slightly smaller than the negative threshold voltage; when acting alone, it switches the memristor to the high-resistance state (HRS) $R_{OFF}$, representing logic 0. $V_{COND}$ does not reach the threshold voltage, so it does not change the resistance state of the memristor when acting alone. The IMPLY logic circuit can be built by exploiting these resistance-switching characteristics. The circuit and truth table of IMPLY logic are shown in Fig. 2 and Table 1.
However, there are two problems when IMPLY logic is used for complex logic operations. First, the execution efficiency of IMPLY is low, so many operation steps are required for complex logic. Second, an input of IMPLY is overwritten. As shown in Fig. 2, IMPLY logic has only two computational memristors, while logical operations generally have two or more operands. Therefore, original data must be written into memristor Q, whose logic state is then overwritten by the operation result. If that input value is needed again, it has to be copied elsewhere before the operation.
To solve the above problems, Shin et al. [28] proposed a high-fan-in NOR gate, which can execute multiple implications simultaneously in one step. Later, Huang et al. [29] proposed 3M1R NAND and AND logic. This structure contains 2 input memristors and 1 output memristor, so its input values are not overwritten and the execution efficiency is improved. Siemon et al. [30] proposed an ORNOR logic gate, which includes 3 input memristors, one of which simultaneously serves as the output. This logic can complete NOR logic and OR logic in 1 step.
A computing system includes a number of fundamental blocks. The efficiency of these fundamental blocks has an important impact on the execution efficiency of the computer system. Among them, the FA is one of the most frequently used blocks. Therefore, the design of memristor-based FA circuits has become a hot research topic.
The parallel approach is a common structure in FA design. As shown in Fig. 3, each bit occupies a different row together with its working memristors, and different rows can execute simultaneously, so this approach effectively saves operation steps. However, the number of memristors is greatly increased in the parallel structure. For example, the 32-bit FA proposed by Kvatinsky [26] needs 178 operation steps and 288 memristors. The 3M1R-based 32-bit FA [29] needs 104 operation steps and 288 memristors. The ORNOR-based 32-bit FA [30] needs 79 operation steps and 198 memristors, better than the above two.
The serial FA [31] is another common structure. As shown in Fig. 4, all memristors are connected to ground via a resistor. Voltage $V_i$ is applied to the corresponding single memristor, and only one operation can be executed at a time. The advantage of this structure is its small area, but at the cost of increasing the total number of steps. For example, a 32-bit serial FA requires only 67 memristors, but 704 steps. A semiparallel FA was then proposed [27], which has the same number of memristors as the serial method but reduces the steps of the 32-bit FA to 544. Another improved FA is the semi-series structure [32]. In this structure, each bit is calculated serially, but the working memristors are arranged in a separate third part, thereby achieving partial parallelism; a 32-bit FA of this kind needs 70 memristors and 322 steps. Overall, compared with the parallel approach, the improved serial methods still require more steps.
Fig. 3 The n-bit FA using parallel structure. Each row calculates 1 bit. $A_i$ and $B_i$ are addend and summand. $M_{i,j}$ is the register during operations. $C_i$ is a carry. The standard sum can be stored anywhere except $C_i$, depending on the specific algorithm of the designer
Fig. 4 The n-bit FA using serial structure. All calculations are on the same row. $A_i$ and $B_i$ are addend and summand. $M_1$ and $M_2$ are the registers during operations. C is a carry. The output result is saved in $A_i$ or $B_i$
To make calculations more efficient, based on Shin's work [28], we propose a multiple-input-multiple-output (MIMO) logic circuit design scheme. The MIMO scheme contains multi-input logic and multi-output logic, which exhibit high computational efficiency and provide data reusability. Then, to improve the efficiency of interaction between memristors in different rows of the traditional crossbar array, we propose an alternating crossbar array structure, which makes it possible to complete the carry operation in the FA in one step. Finally, a highly efficient FA based on the MIMO logic circuit design scheme and the alternating crossbar array structure is proposed. Compared with existing FAs, the proposed FA needs far fewer operation steps than all of them and fewer memristors than the parallel ones. Because of the fewer memristors and operation steps, it consumes less power when adding operands of the same number of bits.
The main contributions of this work are as follows:
1. The proposed MIMO logic circuit design scheme can improve the efficiency of computation, avoid the overwriting of inputs and extend the application of outputs.
2. The proposed alternating crossbar array structure has higher computational efficiency than the traditional crossbar array in cross-row logic operations.
3. The proposed FA is faster than all other designs, requires less area than other parallel designs, and provides data reusability.
The rest of this paper is organized as follows: In Sect. 2, we introduce the memristor model and two basic memristive logic operations. In Sect. 3, the proposed MIMO logic circuit design scheme is described in detail. The proposed alternating crossbar array structure and FA are presented in Sect. 4. Section 5 shows the correctness of our design through PSpice simulation. In Sect. 6, comparisons between different FAs are presented. The paper is summarized in Sect. 7.

Memristor model
This paper adopts the drift speed adaptive memristor (DSAM) model [33]. The DSAM model is based on three main characteristics, namely a linear I-V relationship, drift speed adaptive control, and a voltage threshold, and it describes the characteristics of the ideal memristor well. By adjusting the fitting parameters, a variety of state-variable curves can be produced, which makes it possible to describe different memristors. Moreover, the model simultaneously satisfies boundary validity, scalability and nonlinearity, and solves the boundary lock problem. It reproduces the conduction curve under pulse excitation sensitively and accurately, which allows us to approach ideal behavior in logic operations. Our work mainly uses the binary characteristics of the memristor to carry out logic operations. We adjusted the parameters of this model and found that various logic operations can be realized quickly and stably under the excitation voltages.
In the I-V relationship (1) of the model, the state variable $x$ represents the normalized width of the conductive region, and its range is $[0, 1]$. In addition, $\Delta R = R_{OFF} - R_{ON}$, where $R_{OFF}$ and $R_{ON}$ represent the high-resistance state (HRS) and low-resistance state (LRS) of the memristor, with corresponding state variables $x = 1$ and $x = 0$. In the expression for the derivative of the state variable $x$, $v_{on}$ and $v_{off}$ represent the positive and negative threshold voltages, $a$ and $p$ are curve-fitting parameters, and $k_{on}$ and $k_{off}$ are linear adjustable parameters.
Generally, LRS is considered as logic 1 (closed), and HRS is considered as logic 0 (open). The I-V characteristic curve of the ideal memristor model is shown in Fig. 1. When the applied voltage is greater than the positive threshold voltage $V_{CLOSE}$, the memristor switches from state 0 to state 1; when the applied voltage is less than the negative threshold voltage $V_{OPEN}$, the memristor switches from state 1 to state 0.

IMPLY logic operation
P IMPLY Q is a logic operation called material implication (IMPLY), which can be realized by the memristive IMPLY logic circuit. As shown in Fig. 2, P and Q are two memristors connected to the resistor $R_G$ through the horizontal nanowire L, and $R_G$ is grounded. To explain the principle, we simply assume that the memristor parameters satisfy $R_{ON} \ll R_G \ll R_{OFF}$. By applying two fixed voltages $V_{COND}$ and $V_{SET}$ to memristors P and Q, respectively, through three-state buffers, the IMPLY operation $q = \bar{p} + q$ is achieved. The logical value of memristor Q is replaced by the operation result, and the value of memristor P stays unchanged.
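When the state of memristor P is HRS ($p = 0$), the current through P is negligible and Q effectively sees $V_{SET}$ in series with $R_G$. If Q is HRS, nearly all of $V_{SET}$, which exceeds $V_{CLOSE}$, drops across it and switches it to LRS; if Q is already LRS, it stays there. Hence, no matter what the state of memristor Q is, it ends in LRS ($q = 1$). The above analysis corresponds to case 1 and case 2 of Table 1.
When the state of memristor P is LRS ($p = 1$), the voltage on nanowire L rises to roughly $V_{COND}$ or above, so the voltage across Q is no more than about $V_{SET} - V_{COND}$, which is below $V_{CLOSE}$. In this case, Q will remain in its original state. The above analysis corresponds to case 3 and case 4 of Table 1.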
Above is a brief introduction to the principle; next we derive the constraints rigorously. Let $V_L$ denote the voltage on nanowire L. According to Kirchhoff's current law, the voltage drops across memristors P and Q are:

$$V_P = V_{COND} - V_L, \qquad V_Q = V_{SET} - V_L, \qquad V_L = \frac{V_{COND}/R_P + V_{SET}/R_Q}{1/R_P + 1/R_Q + 1/R_G} \quad (4)$$

To ensure that the IMPLY logic operation functions correctly, in case 1, $V_Q$ must be greater than $V_{CLOSE}$, and in case 3, $V_Q$ must not be greater than $V_{CLOSE}$. Then we get two inequalities, for case 1:

$$V_Q > V_{CLOSE}$$

and for case 3:

$$V_Q \le V_{CLOSE}$$

In any case, the resistance state of memristor P must remain unchanged. Replacing $R_P$ and $R_Q$ in (4) with the corresponding states in cases 1-4 yields four inequalities for $V_P$, one per case, all obtained by the same principle.
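The node analysis above can be checked numerically. The following is a minimal sketch, not the paper's simulation: it solves Kirchhoff's current law at nanowire L for the four input cases and applies an idealized threshold rule to Q. All resistances and voltages are assumed illustrative values (the paper's Table 7 parameters are not reproduced here), chosen only so that R_ON << R_G << R_OFF and V_COND < V_CLOSE < V_SET; the function names are hypothetical.

```python
# Illustrative values only; they are assumptions, not the paper's Table 7.
R_ON, R_OFF, R_G = 1e3, 100e3, 10e3     # ohms, chosen so R_ON << R_G << R_OFF
V_COND, V_SET, V_CLOSE = 0.8, 1.2, 0.9  # volts, with V_COND < V_CLOSE < V_SET


def resistance(bit):
    """Map a logic value to a memristance: 1 -> LRS (R_ON), 0 -> HRS (R_OFF)."""
    return R_ON if bit else R_OFF


def imply_voltages(p, q):
    """Solve KCL at nanowire L and return (V_P, V_Q), the drops across P and Q."""
    r_p, r_q = resistance(p), resistance(q)
    v_l = (V_COND / r_p + V_SET / r_q) / (1 / r_p + 1 / r_q + 1 / R_G)
    return V_COND - v_l, V_SET - v_l


def imply_step(p, q):
    """Return the new value of q: Q closes only if its drop exceeds V_CLOSE."""
    _, v_q = imply_voltages(p, q)
    return 1 if v_q > V_CLOSE else q


for p in (0, 1):
    for q in (0, 1):
        v_p, v_q = imply_voltages(p, q)
        print(f"p={p} q={q}  V_P={v_p:+.3f} V  V_Q={v_q:+.3f} V  ->  q'={imply_step(p, q)}")
```

With these assumed values, only the case p = 0, q = 0 drives the drop across Q above the threshold, which is the behavior the IMPLY truth table requires. A real memristor switches gradually rather than at a hard threshold, so this sketch only illustrates the static constraint.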
By combining $V_{COND} < V_{CLOSE} < V_{SET}$ with the above inequalities, expressions for the upper and lower bounds of $R_G$ are obtained. Therefore, if inequalities (8), (9) and (10) are satisfied, the IMPLY operation can be realized. We notice that IMPLY operation can also be achieved when

AND logic operation
The principle and circuit structure of the AND operation are similar to those of the IMPLY operation. The truth table of AND is shown in Table 2. When two fixed voltages $V_{COND}$ and $V_{CLEAR}$ are applied to memristors P and Q, respectively, the AND operation ($q = p \cdot q$) is achieved. The logical value of memristor Q is replaced by the operation result, and the state of memristor P remains unchanged.
For the AND logic operation to function correctly, in case 2, $V_Q$ must be less than $V_{OPEN}$, and in case 4, $V_Q$ must not be less than $V_{OPEN}$. The resistance state of memristor P remains unchanged in all cases. Similar to IMPLY logic, the value range of $R_G$ can then be derived.

MIMO scheme based on memristive logic
In this section, a systematic and more efficient MIMO memristive logic scheme is proposed. The MIMO scheme consists of two parts: multi-input logic and multi-output logic. Each part contains two improvements to the basic memristive logic. Applying them to the design of complex logic effectively improves the computational efficiency and preserves the original input.

Circuit model of MIMO logic
IMPLY and AND logic have the same circuit structure but different excitation voltages. In Fig. 2, memristors P and Q are regarded as the input memristor and the output memristor, respectively.

Multi-input logic
Figure 5a shows the circuit model of Multi-input logic, in which the number of input memristors can be extended to n. When $\{V_1, V_2\} = \{V_{COND}, V_{SET}\}$, corresponding to Multi-input IMPLY logic, the logic operation can be expressed as $q = \overline{p_1 + p_2 + \dots + p_n} + q$. When $\{V_1, V_2\} = \{V_{COND}, V_{CLEAR}\}$, corresponding to Multi-input AND logic, the logic operation can be expressed as $q = (p_1 + p_2 + \dots + p_n) \cdot q$. Here q is the output, and it is initialized to 0 or 1 according to the logical operation. Since Multi-input logic has more than one input memristor, operations involving multiple data can be completed in one step. Using Multi-input logic to design complex logic can effectively reduce computation time and the number of memristors. When the original data are written only to memristors $P_1 \sim P_n$ (memristor Q does not store input data), they are not overwritten by the operation result, as happens in basic IMPLY and AND logic. Multi-input logic therefore provides data reusability.

Multi-output logic
Figure 5b shows the circuit model of Multi-output logic, in which the number of output memristors can be extended to n. The initial logic values of $Q_1, Q_2, \dots, Q_n$ must be the same before calculation. The Multi-output logic operation results are stored in n memristors. When $\{V_1, V_2\} = \{V_{COND}, V_{SET}\}$, corresponding to Multi-output IMPLY logic, the logic operation can be expressed as $q_1 = q_2 = \dots = q_n = \bar{p} + q$. When $\{V_1, V_2\} = \{V_{COND}, V_{CLEAR}\}$, corresponding to Multi-output AND logic, the logic operation can be expressed as $q_1 = q_2 = \dots = q_n = p \cdot q$.
By using Multi-output logic, the operation results can be stored in multiple memristors, which makes it convenient for the output data to participate in different operations and improves the execution efficiency. In addition, if the specific conditions on $R_G$ are satisfied, the Multi-input and Multi-output logic operations can be realized in the same structure at the same time.
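At the Boolean level, the four MIMO operations can be sketched as follows. This is only a logic-level illustration under the assumption of ideal switching; it does not model voltages or the constraints on R_G, and the function names are hypothetical. It shows that the input memristors are only read, while the result lands in the output memristor(s).

```python
def multi_input_imply(inputs, q):
    """q' = NOT(p1 OR ... OR pn) OR q; the inputs are only read."""
    return int(not any(inputs)) | q


def multi_input_and(inputs, q):
    """q' = (p1 OR ... OR pn) AND q; the inputs are only read."""
    return int(any(inputs)) & q


def _common_value(outputs):
    """All output memristors must hold the same value before the operation."""
    if len(set(outputs)) != 1:
        raise ValueError("output memristors must share one initial value")
    return outputs[0]


def multi_output_imply(p, outputs):
    """Every output becomes NOT(p) OR q, where q is the shared initial value."""
    q = _common_value(outputs)
    return [(1 - p) | q] * len(outputs)


def multi_output_and(p, outputs):
    """Every output becomes p AND q, where q is the shared initial value."""
    q = _common_value(outputs)
    return [p & q] * len(outputs)


inputs = (0, 0, 0)                      # original data stay in P1..Pn
print(multi_input_imply(inputs, 0))     # output memristor initialized to 0
print(multi_output_and(1, [1, 1, 1]))   # one result fanned out to three outputs
print(inputs)                           # the inputs are unchanged
```

Initializing the output to 0 before a Multi-input IMPLY yields a NOR of the inputs, and initializing it to 1 before a Multi-input AND yields their OR, which is how the output initialization selects the logical function.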

Constraints of MIMO logic
To execute MIMO logic correctly, circuit parameters should satisfy the following constraints.

Multi-input logic
The parallel resistance value of the input memristors is defined as $R_i$. When all input memristors are in HRS, the parallel resistance is defined as logic 0, and all other cases are defined as logic 1. Each logic value thus corresponds to a value of $R_i$. Similar to the calculation method of IMPLY logic, the constraints of the Multi-input IMPLY logic can be obtained. The constraints of Multi-input AND logic can be obtained by the same method.

Multi-output logic
The parallel resistance value of the output memristors is defined as $R_o$. Because the logic values of all output memristors are kept the same before calculation, there are only two possible values of $R_o$: $R_{ON}/n$ when all n outputs are in LRS and $R_{OFF}/n$ when all are in HRS.
Similar to the calculation method of IMPLY logic, the constraints of the Multi-output IMPLY logic can be obtained. The constraints of the Multi-output AND logic can be obtained by the same method.
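The equivalent resistances that enter these constraints follow from the usual parallel-resistance rule. The sketch below, with assumed values for R_ON and R_OFF and a hypothetical function name, computes the parallel resistance of a group of memristors from their logic states; it can be used for both the input group (R_i) and the output group (R_o).

```python
R_ON, R_OFF = 1e3, 100e3  # ohms; assumed illustrative values


def parallel_resistance(states):
    """Equivalent resistance of memristors in parallel (1 -> R_ON, 0 -> R_OFF)."""
    conductance = sum(1.0 / (R_ON if s else R_OFF) for s in states)
    return 1.0 / conductance


print(parallel_resistance([0, 0, 0, 0]))  # all inputs HRS: logic 0, R_OFF / n
print(parallel_resistance([1, 0, 0, 0]))  # one input LRS: logic 1, just below R_ON
print(parallel_resistance([1, 1, 1, 1]))  # all LRS: R_ON / n
```

Because a single LRS memristor already pulls the parallel value below R_ON, the logic-1 cases of the input group span a range between R_ON/n and roughly R_ON, whereas the output group, whose members always agree, takes only the two extreme values.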

ONO and OA logic
For convenience in the later description, we rename the 2-input IMPLY and 2-input AND as OR-NOT-OR (ONO) and OR-AND (OA), respectively. The ONO logic can be expressed as $q = \overline{p_1 + p_2} + q$. Its circuit implementation is shown in Fig. 6a, and the truth table of ONO logic is shown in Table 3. The OA logic can be expressed as $q = (p_1 + p_2) \cdot q$. Its circuit implementation is shown in Fig. 6b, and the truth table of OA logic is shown in Table 4.
Figure 7 shows the structure of a traditional crossbar array. In this structure, logic operations that involve memristors in different rows usually require multiple steps to complete. For example, computing $X_1$ IMPLY $Y_2$ takes two steps. First, $X_1$ is copied into $Y_1$ by a horizontal AND operation, and then $Y_1$ IMPLY $Y_2$ is computed by a vertical operation.
To achieve rapid data interaction between different rows in a crossbar array structure, an alternating crossbar array structure is proposed, as shown in Fig. 8. It differs from the traditional structure in two points. First, the memristors in columns 1 and 2 are placed alternately to avoid interference from the memristors of adjacent rows. Second, a column of switches is added to isolate the interference of memristors in the same row. By controlling switches $S_i$ and $K_i$, logical operations across different rows and columns can be realized quickly.
For a better understanding, here is an example. In Fig. 8, the IMPLY logical operation is performed on $X_1$ and $Y_2$. $X_1$ is in row 1, column 4, and $Y_2$ is in row 2, column 2. The following operations need to be performed simultaneously:
• By closing switches $S_1$ and $S_2$ and opening the other S-series switches, rows 1 and 2 are selected.
• By closing switch $K_1$ and opening the other K-series switches, the part behind $K_1$ in row 1 is connected, and the part behind $K_2$ in row 2 is separated.
• $V_{SET}$ is applied to column 2 and $V_{COND}$ is applied to column 4, so that the memristors selected for the IMPLY operation contain only $X_1$ and $Y_2$.
Therefore, the IMPLY operation between adjacent rows in an alternating crossbar array can be completed in one step, and there is no need to move the two operands to the same row or the same column. When the calculation involves memristors in adjacent rows, the alternating crossbar array is faster than the traditional crossbar array. Applying the alternating crossbar array makes it possible to complete the carry operation in the FA in one step, which effectively improves computational efficiency.

Proposed FA
In this section, the algorithm and structure of FA are proposed based on the MIMO logic and alternating crossbar array mentioned above.
The basic ADD logic is described as follows: $A_i$ is the addend, $B_i$ is the summand, $C_{i-1}$ is the carry-in from the adjacent lower bit, $S_i$ is the sum of the current bit, and $C_i$ is the carry-out to the adjacent higher bit. The truth table of the one-bit adder is shown in Table 5.
Sum ($S_i$) and carry-out ($C_i$) are calculated from $A_i$, $B_i$ and $C_{i-1}$.
The proposed FA circuit is shown in Fig. 9. The scale of the circuit can be expanded to n bits. Only two bits of the FA are shown here for the convenience of illustration.
In the i-th bit, memristors $A_i$ and $B_i$ store the input logic values. Memristor $C_{i-1}$ stores the inverted carry-out that comes from the adjacent lower bit. Memristors $M_1$ and $M_2$ are used to store intermediate results. The control circuit, which is not discussed in this paper, enables all inputs to occur simultaneously. Memristor $C_i$ is located in the (i+1)-th bit and stores the inverted carry-out of the i-th bit. In our algorithm, the carry-out of each bit is stored as its inverted logical value, and the sum of the current bit is stored in memristor $M_2$.
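The inverted-carry convention can be illustrated at the Boolean level. The sketch below is not the paper's step-by-step memristor algorithm (Table 6 is not reproduced here); it uses the textbook full-adder equations, which is an assumption about the intended function, and only changes how the carry is stored: a stored 0 means the true carry is 1. The function names are hypothetical.

```python
def full_adder_inverted_carry(a, b, c_in_inv):
    """One-bit full adder whose carries are stored inverted (0 means carry = 1)."""
    c_in = 1 - c_in_inv
    s = a ^ b ^ c_in
    c_out = (a & b) | ((a ^ b) & c_in)
    return s, 1 - c_out


def ripple_add(a, b, n_bits=32):
    """Add two unsigned integers bit by bit, passing the inverted carry upward."""
    carry_inv = 1  # inverted carry-in of the lowest bit: 1 means no carry
    total = 0
    for i in range(n_bits):
        s, carry_inv = full_adder_inverted_carry((a >> i) & 1, (b >> i) & 1, carry_inv)
        total |= s << i
    return total, 1 - carry_inv  # sum modulo 2**n_bits and the true carry-out


print(full_adder_inverted_carry(0, 0, 0))  # stored carry-in 0, i.e. true carry-in 1
print(ripple_add(123456, 654321))
```

Under this convention the stored inputs {0, 0, 0} give sum 1 with stored carry-out 1, and {1, 0, 0} give sum 0 with stored carry-out 0, matching the two cases reported for the one-bit simulation.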
The crossbar array implementation of the proposed FA is shown in Fig. 10. In a large-scale array, we need to ensure that memristors are placed alternately on both sides of the array and that there are at least four columns of memristors in the middle. The structure in the dotted box corresponds to Fig. 9. The memristors of the first and last columns in the crossbar array alternate, corresponding to the alternating crossbar array. Resistors $R_G$ and switches $S_i$ also alternate so that the carry operation is completed correctly. Some switches in Fig. 9 are not marked because they are always open in the algorithm of this paper. We use 1T1R crossbar arrays [4] to solve the sneak current issue. The detailed connections of word line W, bit line B and control signal C are shown in Fig. 10.
The operations of each step and the state of the memristors after the operations are recorded in Table 6.
Table 6 The implementation of the i-th FA (columns: Step, Operation, Voltage, and the logical value after operation; i represents the bit position of the FA)
Fig. 9 The proposed FA circuit
Fig. 10 The implementation of 1T1R crossbar array of the proposed FA
In step 5, in the serial operation mode, the position of the memristor storing the carry-in and carry-out depends on the bit position i of the FA. Therefore, the voltage values of $V_1$ and $V_6$ depend on whether i is odd or even.

According to
To show how this design works in the n-bit case, the 10 operating steps are divided into four phases:
Phase 1: steps 1 and 2. Close switches $S_2 \sim S_n$. Open switches $S_1$ and $H_0 \sim H_n$. Note that $S_1$ needs to be open because the original carry-in $C_0$ must stay unchanged in step 1. All n bits execute steps 1 and 2 simultaneously, so this phase takes only 2 steps.
Phase 2: steps 3 and 4. Close switches $H_1 \sim H_n$. Open switches $S_1 \sim S_n$. In this way, $C_i$ can be connected to $W_i$ without affecting other bits, so data can be transferred between the i-th and (i+1)-th bit positions. Similar to Phase 1, all n bits are computed in parallel, so Phase 2 takes 2 steps.
Phase 3: step 5. Close switches $S_i$ and $H_i$. Open all other switches. To calculate the carry-out of the i-th bit position, we must first get the carry-in from the (i−1)-th bit position. Therefore, the n-bit FA can only complete Phase 3 serially, which takes n steps.
The operation diagram of the n-bit FA is shown in Fig. 11. In Phases 1 and 2, all bits execute in parallel, which takes 2 steps per phase; in Phase 3, each bit is executed serially, immediately after the previous bit, which takes n steps; in Phase 4 (steps 6 to 10), all bits execute in parallel as in Phases 1 and 2, which takes 5 steps.
To sum up, for the n-bit FA, the proposed design requires $2 + 2 + n + 5 = n + 9$ steps. In this part, we have proposed the design of the FA and made a preliminary analysis. As shown in Table 6, the Multi-input and Multi-output logics are used multiple times in the FA, which reduces the number of calculation steps. Applying the alternating crossbar array enables the carry operation to be completed in one step, which greatly improves the computational efficiency of the FA.
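The step count can be tallied per phase with a few lines of code. This is a minimal sketch of the arithmetic stated above; the function name and the dictionary layout are illustrative choices.

```python
def fa_steps(n_bits):
    """Total steps of the n-bit FA: three parallel phases plus one serial phase."""
    phases = {
        "phase 1 (steps 1-2, parallel)": 2,
        "phase 2 (steps 3-4, parallel)": 2,
        "phase 3 (step 5, serial carry)": n_bits,
        "phase 4 (steps 6-10, parallel)": 5,
    }
    return sum(phases.values())


for n in (1, 8, 32, 64):
    print(n, fa_steps(n))
```

Only the serial carry phase grows with the word length, so each additional bit costs exactly one more step.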
In addition to improving the computational efficiency, our algorithm and structure also reduce the number of memristors, giving our design a higher integration density. Furthermore, instead of performing IMPLY or AND logic operations on the original input memristors, we use Multi-input logic operations with data reusability to retain the original input data. The proposed FA therefore also has data reusability. Comparisons of speed, area, power consumption, and data reusability are presented in Sect. 6.

The peripheral circuit
The algorithm requires $V_1 \sim V_6$ to supply different voltages in different operations. To realize this, we wrote a control program for the peripheral circuit in Verilog HDL, which outputs different pulse voltages at different times. For example, Table 6 shows that the voltage on bit line $V_1$ takes one of five values, −1.2 V, −0.8 V, 0 V, 0.8 V or 1.2 V, at different times. Five switched voltage sources with these values are therefore connected to $V_1$. With clock-controlled switches driven by the Verilog HDL program, the different input voltages can be applied. After many simulation tests, we found that a step time of 10 ns is sufficient to realize the function of each step. The other switches in the circuit are also connected to the peripheral circuits on both sides and are similarly controlled by the Verilog HDL clock.
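The clock-controlled voltage selection can be sketched as a simple schedule. The block below is a behavioral illustration in Python, not the Verilog HDL program itself. The switch names K1 to K5 and their voltages follow the assignment described in the simulation section, and the step time is the 10 ns stated above; the example schedule after its first entry is hypothetical, because the per-step voltages of Table 6 are not reproduced here.

```python
STEP_NS = 10  # step time in nanoseconds

# Each switch connects V1 to one fixed source when it is turned on.
SWITCH_VOLTAGE = {"K1": -1.2, "K2": -0.8, "K3": 0.0, "K4": 0.8, "K5": 1.2}


def waveform(schedule):
    """Return (start_ns, end_ns, volts) per step; one switch is on in each step."""
    return [
        (i * STEP_NS, (i + 1) * STEP_NS, SWITCH_VOLTAGE[switch])
        for i, switch in enumerate(schedule)
    ]


# Hypothetical schedule: only the first entry (K1, -1.2 V) is taken from the text.
example_schedule = ["K1", "K3", "K4", "K5", "K2"]
for start, end, volts in waveform(example_schedule):
    print(f"{start:3d}-{end:3d} ns: {volts:+.1f} V")
```

Because exactly one switch is on per clock period, the pulse train is fully determined by the sequence of switch names, which is what the clocked controller has to store.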
However, there is a small problem with this peripheral circuit: a peripheral circuit designed for one word length cannot be reused for adders with a different number of bits. To solve this problem, we propose two methods. One is to fix the word length of the adder; the other is to design multiple peripheral circuits and select the appropriate one for each word length.

Simulation
To verify the correctness of MIMO logic and the calculation accuracy of the FA, we simulated them with PSpice. The DSAM memristor model is used in the design, and the main parameters that constrain the implementation of the logic operations are listed in Table 7. According to (9-14, 16-18, 20-22), the difference between $R_{OFF}$ and $R_{ON}$ has a great influence on IMPLY, Multi-input IMPLY and Multi-output IMPLY logic, while the difference between $V_{CLEAR}$ and $V_{COND}$ has a great influence on AND, Multi-input AND and Multi-output AND logic. The parameters can be set according to actual needs. Many parameter combinations are possible, not only the one shown in Table 7. As long as the range of $R_G$ is reasonable, the logic operations can be realized.
Unlike ideal models, real memristors exhibit resistance drift. The resistance varies with the voltage and does not show a perfect threshold characteristic during the transitions from LRS to HRS and from HRS to LRS. Compared to the ideal $R_G$ range, this non-ideal characteristic results in a smaller $R_G$ range for IMPLY, ONO and Multi-output IMPLY logic operations and a larger $R_G$ range for AND, OA and Multi-output AND logic operations. Our simulation results show that the two types of logical operations mentioned above function correctly.
Resistance drift is a serious problem in memristive circuits because it can affect the accuracy of the logic operations. To ensure that subsequent operations are calculated correctly, in some cases the resistance of a memristor holding an intermediate result must be refreshed to return it to $R_{ON}$ or $R_{OFF}$. For specific implementation measures, see [34].
Figure 12 shows the simulation results of ONO logic and OA logic. In Fig. 12, different rows represent the different combinations of initial states of $P_1$ and $P_2$. Columns 1 and 2 show the resistance values of memristors $P_1$ and $P_2$. Columns 3 and 4 show the resistance value of memristor Q for its two initial states (0 or 1). We can see that the input memristors $P_1$ and $P_2$ remain unchanged, while Q changes depending on the inputs. Figure 13 shows the simulation results of two-output IMPLY logic and AND logic. Different rows represent the different combinations of initial states of P, $Q_1$ and $Q_2$. The resistance value of P remains unchanged, and the resistance values of $Q_1$ and $Q_2$ are always the same. All of the simulations produce the correct results.
The result of the one-bit FA is shown in Fig. 14 for the cases $\{A_i, B_i, C_{i-1}\} = \{0, 0, 0\}$ and $\{A_i, B_i, C_{i-1}\} = \{1, 0, 0\}$. Figure 14 records the resistance changes of all memristors in the FA. $C_i$ is not displayed after step 5 because by then it holds the carry-out and will participate in the calculation of the next bit. The results are as follows: in Fig. 14a, $M_2$ is LRS, which means sum = 1, and $C_i$ is LRS, which means carry-out = 0; in Fig. 14b, $M_2$ is HRS, which means sum = 0, and $C_i$ is HRS, which means carry-out = 1. The state of the input memristors does not change, which demonstrates the data reusability.
Figure 15 shows the result of the Verilog HDL simulation of the peripheral circuit. Our design idea is that each input is connected to different voltages through switches, and the switches are turned on and off by a clock, so that different pulse inputs can be realized. Because it is controlled by the clock, the duration of each operation step is fixed in the design; we set it to 10 ns. Figure 15 is a simulation diagram of the input pulses of $V_1$. From the operations in Table 6, the input voltage of $V_1$ may be −1.2 V, −0.8 V, 0 V, 0.8 V or 1.2 V, depending on the operation. Assume that switch K1 is connected to −1.2 V, and K2, K3, K4 and K5 are connected to −0.8 V, 0 V, 0.8 V and 1.2 V, respectively. In the first step of the add operation, Fig. 15 shows that switch K1 is turned on, so the input voltage at this time is a −1.2 V pulse, and so on for the other steps.
To better understand the advantages and disadvantages of the proposed FA, we compared our design with other existing works. In Table 8, we list some characteristics and indicators of different FA designs. The percentage improvement (Imp.) is calculated as $(P_{other} - P_{our}) / P_{worse} \times 100\%$, where $P_{other}$ ($P_{our}$) represents the considered characteristic of the other design (our design) and $P_{worse}$ is the worse value of the two.

Calculation speed
Speed plays an important role in determining the merit of a design and its potential for widespread use and implementation. The proposed MIMO logic-based FA design needs only 41 steps to implement a 32-bit addition. It is 76%, 92%, 94% and 87% faster than the IMPLY-based FAs in [26], [27], [31] and [32], respectively. It is 60% faster than the 3M1R-based parallel FA in [29] and 48% faster than the ORNOR-based parallel FA in [30].

Number of memristors
The number of memristors reflects integration density and cost to some extent. Even if the peripheral circuit accounts for a large proportion of the area, reducing the number of memristors leads to higher integration and lower cost as the number of bits of the FA increases. For 32-bit addition, our design requires 160 memristors, 44% fewer than the IMPLY-based [26] and 3M1R-based [29] parallel designs, and 19% fewer than the ORNOR-based design [30]. The designs in [27,31] and [32] use up to about 58% fewer memristors than ours. This is because they are serial, semiparallel or semiserial designs, which achieve high integration at the expense of computation speed.
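The improvement metric can be applied directly to the 32-bit step and memristor counts quoted in the introduction. The following is a small sketch of that arithmetic. Treating the larger count as the "worse" value and truncating the percentage toward zero are assumptions made here for illustration; the function name and dictionary keys are hypothetical labels.

```python
def improvement(other, ours):
    """(P_other - P_our) / P_worse * 100, truncated toward zero (assumption)."""
    worse = max(other, ours)  # fewer steps or memristors is better
    return int(100 * (other - ours) / worse)


OUR_STEPS, OUR_MEMRISTORS = 41, 160

# (steps, memristors) of the other 32-bit designs, as quoted in the introduction.
others = {
    "[26] IMPLY parallel": (178, 288),
    "[27] semiparallel": (544, 67),
    "[31] serial": (704, 67),
    "[32] semi-series": (322, 70),
    "[29] 3M1R parallel": (104, 288),
    "[30] ORNOR parallel": (79, 198),
}

for name, (steps, memristors) in others.items():
    print(f"{name:20s} steps: {improvement(steps, OUR_STEPS):4d}%  "
          f"memristors: {improvement(memristors, OUR_MEMRISTORS):4d}%")
```

A negative value means the other design is better on that metric, as is the case for the memristor counts of the serial, semiparallel and semi-series designs.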

Data reusability
Non-von Neumann architectures are dedicated to processing in memory, which effectively reduces data transmission. However, the use of IMPLY logic in Refs. [26,27,31] and [32] results in the original input data being overwritten. 3M1R and ORNOR logic themselves provide data reusability, but the designs in [29] and [30] do not avoid overwriting the original data during the calculation. Losing the original data means that the data cannot be used again after the calculation. One solution is to copy the data to another unit [35], but this increases the time required for data transfer and reduces the efficiency of the operation. In our design, by using MIMO logic, the memristors $A_i$, $B_i$ and $C_{i-1}$ do not change state during the calculation. The input data are kept in their original state and can be used again. The reusability of input data reduces data transmission time, improves efficiency, and promotes the combination of storage and computation.

Power consumption
Power consumption is an important index for measuring the quality of adders. Although this work increases the number of switches, it greatly reduces the number of memristors and the number of operation steps. Therefore, compared with other adders, the power consumption of the adder proposed in this paper is relatively small. We select two representative adders, [30] and [32]. When the same MOS transistor is used as the switch and addition is performed with the same number of bits, our power consumption is 22% less than that in Ref. [30] and 39% less than that in Ref. [32].

Conclusion
This paper proposes a MIMO design scheme based on memristive logic and an alternating crossbar array structure. The MIMO scheme consists of two parts, Multi-input and Multi-output logic, which offer high computational efficiency and data reusability and show good prospects for the design of complex logic. The alternating crossbar array structure can perform rapid interactive operations between adjacent rows, which removes the requirement that operands be in the same row or column.
Then, a fast FA design is proposed based on MIMO logic and the alternating crossbar array structure. The proposed FA is superior to the existing parallel designs in both area and speed. Although the serial, semiparallel and semiserial designs are smaller in area than ours, our calculation speed is much faster than theirs. In addition, the proposed FA has another advantage, data reusability, which is not available in any of the other designs. Data reusability, which may reduce data loss and data transmission time, is important in processing-in-memory architectures. Furthermore, the alternating crossbar array structure can be used for other complex operations to reduce data movement steps. Although the adder designed in this work increases the number of switches, it greatly reduces the number of memristors and the number of steps needed to carry out the algorithm. Compared with other works, our proposed adder has relatively low power consumption.
Finally, to realize the arithmetic operation of the adder, we design a peripheral circuit that controls the voltages of the switches and memristors through Verilog HDL. We design the step time accordingly and use the clock to control the pulse voltage, so that the memristor can reach a stable value within each step. At each time step, the voltage that meets the conditions is output, and the arithmetic operation of the adder is realized.

Data Availability
Data sharing is not applicable to this article, as no datasets were generated or analyzed during the current study.

Declarations
Conflict of interest
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Fig. 5 a Circuit model of Multi-input logic operation.b Circuit model of Multi-output logic operation

Fig. 6 a Circuit implementation of ONO logic operation.b Circuit implementation of OA logic operation

Fig. 7 IMPLY operation of different rows in traditional crossbar array structure

Fig. 11 The operation diagram of the n-bit FA

Fig. 12 The change of each memristor resistance in ONO logic (a) and OA logic (b)

Funding
This work was supported by the National Natural Science Foundation of China (62171182), the Natural Science Foundation of Hunan Province (2021JJ3014) and the Natural Science Foundation Project of Chongqing, Chongqing Science and Technology Commission (CSTB 2022NSCQ-M SX0770).

Table 1
Truth table of IMPLY logic operation

Table 2
Truth table of AND logic operation

Table 3
Truth table of ONO logic operations

Table 5
Truth table of OA

Table 6
One-bit FA calculation needs 10 steps. The excitation voltage not mentioned in each step in the table is 0. Steps 2, 5, 7 and 10 use Multi-input logic, and steps 3 and 4 use Multi-output logic. Note that in steps 1, 3, 4, 9 and 10, V_1 and V_6 apply the same voltage, because in parallel operation mode both columns 1 and 6 can store the carry-in and carry-out.

Table 7
Memristors and circuit parameters considered in the simulations

Table
Comparisons Between the Proposed FA and Other Works