Performance analysis and evaluation of ternary optical computer based on asynchronous multiple vacations

The ternary optical computer (TOC) has become a research hotspot because of advantages such as inherent parallelism, numerous trits, low power consumption, extensibility, bitwise allocability and dynamic bitwise reconfigurability. Meanwhile, its performance evaluation has attracted increasing attention from potential users and researchers. To model its computing ecology more accurately, this paper first builds a three-staged TOC service model by introducing asynchronous multiple vacations and tandem queueing, and then proposes a task scheduling algorithm and an optical processor allocation algorithm that allow asynchronous vacations of some small optical processors, after dividing the entire optical processor equally into several small optical processors that can be used independently. An analytical model is then established to obtain important performance indicators such as response time, number of tasks and optical processor utilization, based on M/M/1 and M/M/n queueing systems with asynchronous multiple vacations. In addition, relevant numerical simulation experiments are conducted. The results illustrate that the number of small optical processors, the vacation rate and the number of small optical processors allowed to be on vacation have important effects on system performance. Compared with synchronous vacations, asynchronous vacations not only allow the system to receive better maintenance but also improve system performance to some degree.


1 Introduction
With the development of science and technology, especially artificial intelligence, the demand for computing power keeps growing. Scientists have increased the speed of a single CPU by miniaturizing electronic components to micron sizes, so that electrons travel a shorter distance in a shorter time. At the same time, they have developed cloud computing platforms such as the Amazon cloud, Google cloud and Ali cloud, and a new batch of supercomputers such as Summit, Sierra and Sunway TaihuLight, to push computing power beyond that of a single computer. However, these technologies and strategies have not fundamentally changed the inherent bottlenecks of the electronic computer, such as high energy consumption, low bandwidth and high latency. Therefore, many researchers have been exploring new types of computing, such as DNA computing, optical computing and quantum computing.
Among these novel computers, optical computers have attracted growing interest in recent decades because of their low energy consumption, high bandwidth, free interconnection in three-dimensional space and high parallelism. For example, in the early 1970s, Heinz et al. mathematically investigated matrix multiplication utilizing coherent optical correlation techniques and optical analog methods (Heinz et al. 1970). Ambs (2010) summarized the 60-year adventure of optical computing. In 2017, Zangeneh-Nejad et al. designed and realized a reconfigurable and highly miniaturized analog optical differentiator using a half-wavelength plasmonic graphene film based on graphene-supported plasmon wave characteristics (Zangeneh-Nejad et al. 2018). In 2018, Rashed et al. developed an optoelectronic converter to realize all-optical computing operations based on nonlinear metamaterials (Rashed et al. 2019). In 2019, Zhou et al. implemented a multi-layer spatial optical differentiator by designing a deep neural network that can predict the reflection coefficient of a 12-layer film (Zhou et al. 2020). Consequently, optical computing remains a common focus. Jin pioneered the ternary optical computer (TOC), which expresses information in three optical states: no intensity light (NIL), vertically polarized light (VPL) and horizontally polarized light (HPL). Many significant achievements have been obtained. For instance, in hardware, Jin et al. proposed the TOC principle and its structure (Jin et al. 2003, 2005). In 2008, Yan et al. put forward the decrease-radix design principle (DRDP) (Yan et al. 2008), which makes the construction of the TOC processor normative and operable; moreover, a processor constructed according to the DRDP is dynamically reconfigurable. Multi-generation TOC hardware platforms have been constructed according to this theory. In 2017, the TOC platform SD16 with 192 trits was successfully constructed. In 2018, Jin et al. built an optical processor (OP) with 1152 trits by combining six SD16s (Wang et al. 2020a). To solve the carry-delay problem, TOC adders (Shen and Pan 2014; Peng et al. 2014; Wang et al. 2010; Jin et al. 2010; Song and Yan 2012) and multipliers (Song et al. 2019) were designed and realized based on the modified signed-digit (MSD) number system. Zhang et al. (2017) designed and implemented a positive/negative value judger for MSD data to further improve the three-valued optical processor. Ouyang et al. (2016) proposed the structure and theory of dual-space memory to solve the problem of data transmission in the TOC. In software, Wang et al. (2011) presented the module structure and inter-module communication protocol of the TOC monitoring system, and Song and Jin (2011) and Jin et al. (2013) deeply discussed its task management system and trit resources. Zhang et al. (2018), Jin et al. (2019) and Li et al. (2018) realized programming applications by designing and implementing the operation-data file SZG on the TOC; Gao et al. (2013) implemented the seamless combination of the C language and the TOC by successfully transplanting the former to the latter. Meanwhile, an MSD multiplication program and MPI programming technology (Zhang et al. 2014) were also achieved on the TOC. In numerical calculation, parallel carry-free addition (Wang et al. 2010), multiplication (Song et al. 2019), division (Xu et al. 2016) and vector-matrix multiplication (Wang et al. 2010) were realized on the TOC. Peng et al. (2017) and Peng et al. (2018) achieved the fast Fourier transform (FFT) and the discrete Fourier transform (DFT) on the TOC, respectively. Song et al. (2020) implemented an algorithm for solving higher-order derivatives by configuring multipliers and adders on the TOC platform. Zhang et al. (2020) used cellular automata on the TOC to simulate three-lane traffic flow. In addition, Li et al. (2019) designed and implemented a parallel artificial bee colony algorithm on the TOC. These efforts have carried the research on the TOC from theory to numerical application, and some breakthroughs have been made.
However, there are few reports on another important research direction of the TOC: performance analysis and evaluation. Therefore, this paper analyzes the QoS (quality of service) indicators of the TOC, proposes a three-staged TOC service model, and analyzes and evaluates the performance of the TOC based on queueing theory. This paper makes the following contributions:
• After illustrating the computing paradigm of the TOC and its primary modules, it builds a three-staged TOC service model by connecting three queues, the receiving queue, the scheduling queue and the transmitting queue, in series based on vacation queueing and tandem queueing.
• Based on the equal-partition strategy of the OP (Wang et al. 2020a), it proposes a TOC task scheduling algorithm with asynchronous vacations of some small OPs (SOPs), a processor allocation algorithm and a processor recovery algorithm.
• It constructs analytical models that evaluate the important performance indicators of the TOC, including response time, number of tasks and OP utilization. In particular, the analytical model of the second stage is solved using the quasi-birth-death process and the rate matrix.
• It fully demonstrates, by numerical simulation, the influence on TOC performance of different parameters such as the number of SOPs, the number of SOPs allowed to be on vacation and the vacation rate, and analyzes the reasons for the results.
• Finally, it illustrates, by numerical simulation, the influence on TOC performance of different service models with vacations, including the three-staged service model with asynchronous vacations (TSSMAV), the four-staged service model with synchronous vacations (FSSMSV) and the four-staged service model with asynchronous vacations (FSSMAV).
The rest of the paper is organized as follows. Section 2 chiefly describes the background and motivation of this paper, in particular the performance analyses based on different queueing systems. Section 3 introduces the three-staged service model with vacation queueing after presenting the computing paradigm of the TOC. In Sect. 4, we primarily present a TOC task scheduling algorithm that allows some SOPs to be on asynchronous vacation, together with a processor allocation algorithm. In Sect. 5, we focus on the construction of the performance analysis and evaluation model; in particular, we build the analytical model that solves the performance indicators of the second stage, including response time, number of tasks and OP utilization, by using the quasi-birth-death process and the rate matrix. Sections 6 and 7 demonstrate, by numerical simulation, the influence on TOC performance of different parameters and of three service models with vacations. Finally, Sect. 8 gives some concluding remarks and possible future research directions.

2 Background and motivation
The performance of the TOC has concerned researchers since the early stages of its development. For example, Liu et al. designed and implemented a subsystem for measuring the response time of the optical processor in the TOC (Liu et al. 2009). However, it can only measure the response time of the computing component; it ignores the transmission time of the data from the client to the TOC, that of the computing results from the TOC back to the client, and other performance indicators such as optical processor utilization. In Wang et al. (2017), a four-staged TOC service model was established for the first time, and the response time of the TOC was analyzed and evaluated based on the M/M/1 queueing system; it concluded that network speed is the bottleneck of TOC performance. Wang et al. (2019) also built a four-staged service model of the TOC, based on a more complex queueing system composed of M/M/1, M/M/n, M^X/M/1 and M/M^B/1 queues. They made a comparative analysis of the response times under the immediate scheduling strategy and the computing-accomplished scheduling strategy, and the results show that the latter is clearly superior to the former. Nevertheless, they did not consider the possible need for maintenance of the TOC. To model the computing ecology of the TOC better, Jin et al. (2005) analyzed and evaluated the performance of the TOC based on synchronous multiple vacations, and the results showed that the number of uniformly partitioned SOPs and the vacation rate have a great impact on system performance.
In 2018, a TOC with 1152 trits was built by combining six optical processors (each with 192 trits). This not only shows that the TOC has very good scalability, but also shows that the assumption of synchronous vacations of all SOPs in Jin et al. (2005) is not sufficient to describe the computing ecology of the TOC accurately and leads to a waste of processor resources. On the other hand, asynchronous vacations of some SOPs are more suitable for the practical application of the TOC. Consequently, this paper analyzes and evaluates the performance of the TOC by taking asynchronous multiple vacations of some SOPs into consideration.
3 Three-staged service model of ternary optical computer

3.1 Computing paradigm of ternary optical computer
As can be seen in Fig. 1, the computing paradigm of the TOC is Client/Server. The function of the Client mainly includes two aspects. On the one hand, it submits the operation request to the Server. In Wang et al. (2020a, 2011, 2017, 2019), the Client resolves the operation into the binary three-valued logic operations needed to implement it when the user clicks the ''Submit'' button. Meanwhile, it obtains the number of logic operations, the computation amount of each logic operation and the total computation amount, and sends the operation request to the Server after transforming the operands input by the user into internal communication codes. The preprocessing module in the Server then changes the internal communication codes into the internal control codes used to implement the optical computing. In this paper, the Client is also used to improve the user experience and system performance, in addition to providing an input interface for users to enter operations and data. In other words, the preprocessing module is deleted, as shown in Fig. 1, so the operands in the operation request are expressed not in internal communication code but in internal control code. On the other hand, the Client displays the results sent by the Transmitter in the Server when it receives them. The Server is made up of the Host computer (HC) and the Slave computer (SC). The HC mainly consists of four modules: the Receiver (R), the Scheduler (S), the Manager (M) and the Transmitter (T). The Receiver receives the task, i.e. the operation request sent by the Client; the Scheduler schedules the tasks according to a certain strategy; the Manager primarily allocates and recovers the SOPs of the TOC; and the Transmitter sends the operation results to the corresponding Client. In addition, the SC mainly consists of three modules: the optical encoder (OE), the OP with its reconfigurable component, and the optical decoder (OD).
These modules coordinate with each other and organically form a whole TOC system to achieve the user's calculation.
3.2 Three-staged service model of ternary optical computer

The literature (Wang et al. 2020a, among others) constructed four-staged TOC service models. According to the description above, this paper builds a three-staged TOC service model with vacations, as shown in Fig. 2. The model connects Stage 1 to Stage 3 in series, and each stage has its own queue: the receiving queue (RQ), the scheduling queue (SQ) and the transmitting queue (TQ), in turn. We assume that all queues are of the blocked-request-delay type and that the queueing rule of every queue is first-come-first-served (FCFS). The main function of each stage is as follows.
• In Stage 1, operation requests submitted by users arrive at the Receiver R at an average arrival rate λ. R puts them into the RQ under the FCFS strategy. It then takes the operation requests out of the RQ in turn while the queue is not empty and sends them to the Scheduler S. Thus, the operation requests become tasks in the TOC.
• Stage 2 comprises more functional modules: the Scheduler S, the Manager M and the SC. The Scheduler S inserts the received tasks into the scheduling queue SQ in turn and then schedules the tasks in the SQ by some scheduling strategy, sending them to the optical encoder OE in the SC. The Manager M allots an SOP to each just-scheduled task and allocates trit resources to each binary three-valued logic operation the task needs. In the meantime, it finds the reconfiguration codes of the binary three-valued logic operations and sends the allocation and reconfiguration information to the OP in the SC. The OE converts the electrical signals into optical signals, i.e. HPL, VPL and NIL, after receiving the data represented in internal control code. After the reconfiguration unit in the OP finishes the OP reconfiguration in full parallel using the reconfiguration codes, the OP implements the optical computing as the optical signals generated by the OE pass through it. The optical decoder OD decodes the operation results, i.e. converts the optical signals into electrical ones in internal control code, and then judges whether they are the final operation results. If so, they are sent to the Transmitter T in the HC; otherwise, they are sent back to the encoder to participate in the next operation. On the other hand, if there is no task to calculate, the SC starts a vacation of random length, according to a certain strategy, for the SOP that has just finished a task and sends the ''vacation'' signal to the Scheduler S. It sends the ''vacation is over'' signal to S when the vacation ends. S then schedules a task from the SQ if one is waiting; otherwise, the SOP goes on its next vacation.
• The Transmitter T in Stage 3 puts the received operation results into the TQ in turn, and then takes the operation results out of the TQ and sends them to the corresponding Client according to the FCFS policy.
The three stages form a tandem queueing model. Compared with Wang et al. (2020a, 2017, 2019), the computing paradigm of the TOC proposed in this paper deletes the preprocessing module in the Server, and the service model has only three stages. Therefore, we assert that this model will yield a significant performance improvement over the previous service models.

Fig. 2 The three-staged TOC service model with vacations

4 Task scheduling and optical processor management algorithms with asynchronous vacations of some SOPs

Considering the unique characteristics of the TOC optical processor, such as numerous trits, extensibility, bitwise allocability and dynamic bitwise reconfigurability, this paper similarly divides the entire optical processor into several SOPs according to the equal-partition strategy. We assume that there are N trits in total and divide the consecutive trit resource into n SOPs, each of which can be used independently and has N_s = N/n trits. Under the asynchronous vacation strategy, it is assumed that at most c SOPs are allowed to be on asynchronous vacation. Obviously, each SOP has three states, ''idle'', ''busy'' and ''vacation'', denoted by ''0'', ''1'' and ''2'', respectively. The Scheduler S executes the task scheduling algorithm shown in Algorithm 1, and the Manager M fulfills the processor allocation and recovery algorithms shown in Algorithms 2 and 3, respectively.
Algorithm 1 Task scheduling algorithm with asynchronous vacations of some SOPs
• Step 1: Initialize the system parameters. Set the scheduling queue SQ to null. Meanwhile, set the length L_SQ of the SQ, the number d of SOPs in the vacation state and the number N_pring of tasks being processed all to 0
• Step 2: When a task arrives, S inserts it into the SQ by the FCFS policy, increases L_SQ by 1, and jumps to Step 3
• Step 3: Judge whether there is an idle SOP, i.e. whether N_pring is less than n - d. If so, jump to Step 4; otherwise, jump to Step 9
• Step 4: Schedule a task, i.e. send it to the SC, increase N_pring by 1, decrease L_SQ by 1, send the relevant information on the task to the Manager M, and jump to Step 5
• Step 5: Judge whether L_SQ is equal to zero. If so, jump to Step 9; otherwise, jump to Step 3
• Step 6: When S receives the ''i finished'' signal sent by the SC, which means that the task on the i-th SOP is finished, decrease N_pring by 1 and jump to Step 7
• Step 7: Judge whether L_SQ is greater than zero. If so, jump to Step 3; otherwise, judge whether d is equal to c. If so, send the ''i0'' signal to M; otherwise, increase d by 1 and send the ''i2'' signal to M. Jump to Step 9
• Step 8: When S receives the ''vacation over'' signal, decrease d by 1 and jump to Step 7
• Step 9: The algorithm is over

As can be seen from Algorithm 1, S keeps scheduling tasks until there is no idle SOP or no task in the SQ. M carries out the recovery of processor resources when a task is finished and L_SQ is equal to zero. There are two alternatives for the SOP that has just finished a task: it starts a vacation, that is, its state changes from ''1'' to ''2'', if d is less than c; it becomes idle if d is equal to c. In other words, an SOP can start a vacation only when it has finished a task, there is no task to be scheduled and d is not equal to c.
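As a cross-check of the control flow above, the scheduling logic can be sketched in Python; the class and method names (`Scheduler`, `on_finished`, `on_vacation_over`) are illustrative, not taken from the paper:

```python
from collections import deque

class Scheduler:
    """Sketch of Algorithm 1: task scheduling with asynchronous vacations.
    `n` is the number of SOPs and `c` the maximum allowed on vacation."""
    def __init__(self, n, c):
        self.n, self.c = n, c
        self.sq = deque()      # scheduling queue SQ (FCFS)
        self.d = 0             # SOPs currently on vacation
        self.n_pring = 0       # tasks being processed
        self.dispatched = []   # record of tasks sent to the SC

    def submit(self, task):
        self.sq.append(task)   # Step 2: insert by FCFS
        self._drain()

    def _drain(self):
        # Steps 3-5: schedule while an idle SOP exists and SQ is non-empty
        while self.sq and self.n_pring < self.n - self.d:
            self.dispatched.append(self.sq.popleft())
            self.n_pring += 1

    def on_finished(self, i):
        # Steps 6-7: the task on SOP i is finished
        self.n_pring -= 1
        if self.sq:
            self._drain()
        elif self.d < self.c:
            self.d += 1        # SOP i starts an asynchronous vacation
        # else: SOP i simply becomes idle

    def on_vacation_over(self):
        # Step 8: a vacation ends; try to schedule again
        self.d -= 1
        self._drain()
```

With n = 6 and c = 2, three finished tasks and an empty SQ leave two SOPs on vacation and one idle, matching the cap described in the text.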
Algorithm 2 Processor allocation algorithm
• Step 1: Initialize the system parameters. Set the states S[0…n - 1] of all SOPs to zero. j = 0 (j points to the SOP to be allocated)
• Step 2: j = j mod n, and judge whether S[j] is equal to zero. If so, S[j] = 1 and jump to Step 4; otherwise, jump to Step 3
• Step 3: Increase j by 1 and jump to Step 2
• Step 4: k = 1 (k indexes the binary three-valued logic operations of the task) and jump to Step 5
• Step 5: Judge whether k is greater than the number N_Log of binary three-valued logic operations needed by the task. If so, jump to Step 7; otherwise, jump to Step 6
• Step 6: Allocate n_k = N_s·C_k/C trits of the SOP to the k-th logic operation, where C_k is the computation amount of the k-th logic operation and C = C_1 + C_2 + … + C_(N_Log). Increase k by 1 and jump to Step 5
• Step 7: The algorithm is over

It can be seen that the algorithm still uses the proportional allocation strategy to allocate the trit resources of the SOP, as Wang et al. (2020a, 2017, 2019) do, focusing on the synchronous completion of all binary three-valued logic operations in the task.
Algorithm 3 Processor recovery algorithm
• Step 1: Obtain ''i'' and the state ''0'' or ''2'' by splitting the information received from S, and assign them to I and s, respectively
• Step 2: S[I] = s. Judge whether s is equal to zero. If so, jump to Step 3; otherwise, send ''2'' to the SC
• Step 3: The algorithm is over

It can be seen that Algorithm 3 achieves the recovery of SOPs by setting their states to ''0'' or ''2''. In other words, it recovers not only the idle SOPs but also the SOPs on vacation.
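Algorithms 2 and 3 can be sketched together; the function names and the integer-division rounding in `trits_for` are simplifying assumptions, not details from the paper:

```python
def allocate(states, j):
    """Sketch of Algorithm 2's round-robin search: `states` holds 0
    (idle), 1 (busy) or 2 (vacation) per SOP, and `j` points at the
    next SOP to try.  Returns the allocated index and the advanced
    pointer, or (None, j) when no SOP is idle."""
    n = len(states)
    for _ in range(n):
        j %= n
        if states[j] == 0:      # Step 2: an idle SOP is found
            states[j] = 1
            return j, j + 1
        j += 1                  # Step 3: try the next SOP
    return None, j

def trits_for(N_s, amounts):
    """Proportional trit allocation for one task's logic operations
    (Steps 5-6): n_k = N_s * C_k / C with C the total amount."""
    total = sum(amounts)
    return [N_s * ck // total for ck in amounts]

def recover(states, i, s):
    """Sketch of Algorithm 3: set SOP i back to idle (0) or vacation (2)."""
    states[i] = s
```

For a 64-trit SOP and two logic operations with computation amounts 10 and 30, the proportional split yields 16 and 48 trits.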

5 Analytical model for performance analysis and evaluation
To analyze and evaluate the performance of the TOC, we choose some primary performance indicators, namely the response time, the number of tasks and the OP utilization, denoted as T, R and U, respectively. The response time is defined as the elapsed time from the submission of a request to the TOC until the Client receives the final output, i.e. the time the request is serviced. When the system is in equilibrium, it can be obtained via the following formula:

T = T1 + T2 + T3,   (1)

where Ti (i = 1, 2, 3) denotes the mean service time of Stages 1-3 shown in Fig. 2, and the service time is the sum of the waiting time and the computing time. Similarly, R is the sum of the mean numbers of tasks in each stage, namely

R = R1 + R2 + R3,   (2)

where Ri (i = 1, 2, 3) represents the mean number of tasks in Stages 1-3.

5.1 Analytical model for the Receiver
We can use a server to implement the function of the Receiver R, that is, to receive the operation requests sent by users. For simplicity, we model Stage 1 as an M/M/1 queueing system with single request arrivals and a request buffer of infinite capacity. The specific model is described as follows.
• Assume that the arrivals of requests follow a Poisson process; that is, the inter-arrival times follow a negative exponential distribution with parameter λ.
• The service times of R for the operation requests are independent and identically distributed random variables that follow a negative exponential distribution with parameter μ1, the service rate of R.
• The service mechanism of R is the FCFS policy.
• Denote the mean transmission speed of the network and the mean traffic of the requests submitted to R as ξ and D, respectively.
Thus, the state transition diagram of the continuous-time Markov chain (CTMC) for the number of requests in Stage 1 is shown in Fig. 3, where state m means that there are m requests in Stage 1, one being received and m - 1 waiting in the RQ.
Let ρ1 = λ/μ1. It is known (Gross et al. 2008; Kleinrock 1975) that R reaches an equilibrium state when ρ1 < 1. Denote the probability of state m in equilibrium as p_m (m = 0, 1, 2, …). According to Fig. 3, the steady-state balance equations of the RQ system are

λp_(m-1) = μ1·p_m, i.e. p_m = ρ1^m·p_0, m = 1, 2, ….

Using the normalization equation Σ p_m = 1 (m = 0, 1, 2, …), we obtain the idle probability p_0 of the Receiver R, i.e. the probability that there is no request in R: p_0 = 1 - ρ1. Then the mean number of requests is

R1 = ρ1/(1 - ρ1).   (3)

We can obtain the mean service time T1 of the RQ by Little's law (Gross et al. 2008; Kleinrock 1975), i.e. the service time equals the number of requests in the system divided by the arrival rate of requests:

T1 = R1/λ = 1/(μ1 - λ),   (4)

where λ and μ1 are the arrival rate of requests and the mean receiving rate per unit time, respectively. Obviously, μ1 = ξ/D. Substituting it into (3) and (4) yields

R1 = λD/(ξ - λD),   (5)
T1 = D/(ξ - λD).   (6)
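The closed-form Stage 1 results can be checked numerically; a minimal sketch, with the function name illustrative and the sample parameters taken from Sect. 6.1 expressed in GB and GB/s:

```python
def mm1_metrics(lam, mu):
    """Steady-state mean number in system and mean response time of an
    M/M/1 queue (standard results, cf. Kleinrock 1975)."""
    rho = lam / mu
    assert rho < 1, "the queue is unstable"
    R = rho / (1 - rho)   # mean number of requests in the stage
    T = R / lam           # Little's law
    return R, T

# Stage 1 with the Sect. 6.1 parameters: mu1 = xi / D
xi, D = 0.05, 0.5         # 50 MB/s network speed, 0.5 GB mean traffic
R1, T1 = mm1_metrics(0.02, xi / D)
```

At λ = 0.02 this gives ρ1 = 0.2, R1 = 0.25 requests and T1 = 12.5 s, in agreement with formulas (5) and (6).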

5.2 Analytical model for Stage 2
The output of R is also a Poisson process with parameter λ when it reaches equilibrium, on the basis of Burke's theorem (Gross et al. 2008; Kleinrock 1975). In other words, the arrival rate of tasks to Stage 2 is still λ. As described in Sect. 3, the whole OP of the TOC is uniformly divided into n SOPs. The model of Stage 2 is described as follows.
• Denote the maximum number of SOPs allowed to be on asynchronous vacation as c. Obviously, 1 ≤ c ≤ n.
• Assume the mean computation amount is C, and C = mD, where m is a constant greater than 1.
• Denote the speed of the whole OP as r. Assume that the computing times of the TOC for the tasks are independent and identically distributed random variables that follow a negative exponential distribution with parameter μ2, the service rate of the whole OP; that is, μ2 = r/(mD). The service rate of each SOP is μ2s = r/(mnD) = μ2/n.
• The vacation times of the SOPs are independent and identically distributed random variables that follow a negative exponential distribution with parameter δ, the vacation rate of the SOPs.
• L_v(t) and V(t) denote the number of tasks and the number of SOPs on vacation at time t in Stage 2, respectively.
• The reconfiguration time is ignored because the fully parallel reconfiguration takes very little time.
• The random variables governed by μ2, δ and λ are independent of each other.
Let ρ2 = λ/μ2. Stage 2 reaches a steady state when ρ2 < 1. According to the description above, at any time t the number of SOPs on vacation satisfies V(t) ≤ c. When 0 ≤ L_v(t) ≤ n - c, the number of SOPs on vacation reaches its maximum c, L_v(t) SOPs are busy and the remaining n - c - L_v(t) SOPs are idle. When n - c < L_v(t) ≤ n, at least n - L_v(t) SOPs are on vacation and no SOP is idle. When L_v(t) > n, there is no idle SOP either, although some SOPs may still be on vacation.

Now we use the quasi-birth-death (QBD) process (Tian and Li 2000; Neuts 1981) to obtain the task number R2 and service time T2 of Stage 2 and the utilization U of the OP in equilibrium. {(L_v(t), V(t))} constitutes a two-dimensional Markov process and a QBD process, and its state space Ω is

Ω = {(k, c) : 0 ≤ k ≤ n - c} ∪ {(k, j) : n - c < k ≤ n, n - k ≤ j ≤ c} ∪ {(k, j) : k > n, 0 ≤ j ≤ c}.
The state changes when a task is inserted into the scheduling queue SQ, when a task is completed and when an SOP vacation ends. For example, the state transition mechanism for n = 6 and c = 3 can be obtained by sorting the states by level, as shown in Fig. 4.
The four layers from bottom to top in Fig. 4 represent 0, 1, 2 and 3 SOP(s) on vacation, respectively. For example, the state (7, 2) indicates that there are currently seven tasks in Stage 2, with two SOPs on vacation and the other four SOPs busy. This state can transfer to three states: (8, 2), (6, 2) and (7, 1). When a task enters the SQ, the state transfers to (8, 2) at the arrival rate λ; the SC, i.e. the TOC, operates on four tasks at the service rate 4μ2s, so the state (7, 2) transfers to (6, 2) at the rate 4μ2s; and the state (7, 2) transfers to (7, 1) at the rate 2δ if the vacation of one of the two SOPs ends. The state (5, 1) indicates that there are currently five tasks in Stage 2 and no task in the SQ, with one SOP on vacation and the other five SOPs busy. The state (5, 1) transfers to (4, 2) at the rate 5μ2s when a task is completed, because the freed SOP then starts a vacation. We can obtain the infinitesimal generator matrix G of the states to the left of the dashed line in Fig. 4 by sorting them from left to right and from top to bottom.
G can be written in block-tridiagonal (partitioned) form. In the boundary levels n - c < k ≤ n the diagonal blocks are of order k - n + c + 1, while in the repeating levels C = λI, where I is the identity matrix of order c + 1, and A and B are square matrices of order c + 1 describing the vacation-ending transitions and the service completions, respectively.
The minimal non-negative solution R of the matrix equation

R²B + RA + C = 0

is called the rate matrix (Neuts 1981; Tian and Zhang 2006). The diagonal element r_kk (0 ≤ k ≤ c) of R is a root in (0, 1) of the corresponding scalar quadratic equation, and r_cc = ρ2 < 1; the off-diagonal elements of R are then determined by the remaining linear equations. Let (L_v, V) denote the steady-state limit of the process (L_v(t), V(t)), and let P_k denote the steady-state probability vector of level k, with P_k = (p_kc, p_(k,c-1), …, p_(k,n-k)) for n - c + 1 ≤ k ≤ n. Then the distribution of (L_v, V) can be written as

P_k = K·a_k, 0 ≤ k ≤ n - c;
P_k = K·(a_kc, a_(k,c-1), …, a_(k,n-k)), n - c + 1 ≤ k ≤ n;
P_k = K·a_n·R^(k-n), n < k,   (10)

where K is the normalization constant, K = [Σ a_i·e + a_n·(I - R)^(-1)·e]^(-1), e is a column vector whose elements are all 1, and a_0, a_1, …, a_(n-c), a_(n-c+1), …, a_n is the positive solution of the boundary balance equations of the QBD process. According to formula (10), the joint distribution p_kj of (L_v, V) can be obtained when the system reaches equilibrium, and R2 can be obtained by

R2 = Σ_(k,j) k·p_kj.   (11)

Similarly, the mean service time T2 of the SQ can be calculated by Little's law, i.e. T2 = R2/λ. Moreover, the utilization U of the OP, i.e. the mean fraction of busy SOPs, can be obtained by

U = (1/n)·Σ_(k,j) min(k, n - j)·p_kj.   (12)

Finally, we can obtain the mean number V of SOPs on vacation by

V = Σ_(k,j) j·p_kj.
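Because the rate-matrix computation is involved, a simple Monte-Carlo event simulation offers an independent cross-check of R2. The sketch below assumes the vacation rules of Sect. 4 with exponential inter-event times; all names are illustrative, and it is not the paper's QBD solution:

```python
import random

def simulate_stage2(lam, mu_s, delta, n, c, horizon=100000.0, seed=7):
    """Monte-Carlo sketch of Stage 2: n SOPs, exponential service (rate
    mu_s each), asynchronous multiple vacations (rate delta), at most c
    SOPs on vacation.  Returns the time-averaged task number R2; T2
    then follows from Little's law, T2 = R2 / lam."""
    rng = random.Random(seed)
    t, q, busy, vac = 0.0, 0, 0, 0  # time, queued tasks, busy / vacationing SOPs
    area = 0.0                       # time integral of the task count
    while t < horizon:
        rates = [lam, busy * mu_s, vac * delta]  # arrival, completion, vacation end
        dt = rng.expovariate(sum(rates))
        area += (q + busy) * min(dt, horizon - t)
        t += dt
        u = rng.random() * sum(rates)
        if u < rates[0]:                       # a task arrives
            if busy + vac < n:
                busy += 1                      # an idle SOP serves it at once
            else:
                q += 1                         # no idle SOP: the task waits
        elif u < rates[0] + rates[1]:          # a task completes
            busy -= 1
            if q > 0:
                q -= 1
                busy += 1                      # the freed SOP takes the next task
            elif vac < c:
                vac += 1                       # no work: the SOP starts a vacation
        else:                                  # a vacation ends
            if q > 0:
                q -= 1
                busy += 1
                vac -= 1                       # the returning SOP serves a task
            # otherwise it starts another vacation (multiple vacations)
    return area / horizon
```

With c = 0 the model degenerates to a plain M/M/n queue, so the n = 1 case can be checked against the closed-form M/M/1 mean ρ/(1 - ρ).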

5.3 Analytical model for Stage 3
According to Tian and Zhang (2006), the task arrival rate in the TQ is equal to the output rate of Stage 2, i.e. λ. Assume that the mean traffic from the Transmitter T to the corresponding Client is D/2, so the service rate of this stage is μ3 = 2ξ/D. Meanwhile, we model this stage as an M/M/1 queueing system. Let ρ3 = λD/(2ξ). Similarly, when ρ3 < 1 we obtain

R3 = ρ3/(1 - ρ3) = λD/(2ξ - λD),   (13)
T3 = R3/λ = D/(2ξ - λD).   (14)
We can obtain the number R of tasks and the response time T of the whole system by substituting T1, T2, T3 into (1) and R1, R2, R3 into (2).
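A minimal sketch of how the three stages combine under Eqs. (1) and (2), treating Stages 1 and 3 as M/M/1 queues and taking R2, T2 as given from the Stage 2 model; the function name is illustrative:

```python
def stage_totals(lam, xi, D, R2, T2):
    """Sketch of Eqs. (1)-(2): T = T1 + T2 + T3 and R = R1 + R2 + R3,
    with Stages 1 and 3 modelled as M/M/1 queues of service rates
    xi/D and 2*xi/D, and (R2, T2) supplied by the Stage 2 model."""
    def mm1(mu):
        rho = lam / mu
        assert rho < 1, "stage is unstable"
        R = rho / (1 - rho)   # mean number of tasks in the stage
        return R, R / lam     # Little's law for the mean time
    R1, T1 = mm1(xi / D)      # Receiver
    R3, T3 = mm1(2 * xi / D)  # Transmitter
    return R1 + R2 + R3, T1 + T2 + T3
```

Setting R2 = T2 = 0 isolates the network stages, which is a convenient way to see how much of the total response time the network contributes.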

6 Performance analysis and evaluation by numerical simulation
To verify the correctness and effectiveness of the proposed performance analysis model, we conduct experiments by numerically simulating the model above. Meanwhile, this paper analyzes the influence on system performance of parameters such as the request arrival rate λ, the maximum number c of SOPs allowed to be on vacation and the vacation rate δ. In addition, we consider the impact of different vacation models on TOC performance.

6.1 Parameter settings
The task arrival rate λ ∈ {0.002i | 1 ≤ i ≤ 20, i ∈ N}, the average transmission speed of the network ξ = 50 MB/s, the mean traffic of tasks D = 0.5 GB, the optical computing speed of the OP r = 5 GB/s, the number of SOPs n = 6, the maximum number of SOPs allowed to be on vacation c = 2, the vacation rate of the SOPs δ = 0.1, and the ratio of the computation amount to the mean traffic m = 200. Certainly, these parameters are illustrative; that is, they can be modified.
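Writing the settings out with the derived service rates makes the stability condition of Stage 2 explicit; a sketch with illustrative variable names, units converted to GB and GB/s:

```python
# Sect. 6.1 parameters, expressed in GB, GB/s and tasks per second
xi    = 0.05          # mean network transmission speed: 50 MB/s
D     = 0.5           # mean traffic per task: 0.5 GB
r     = 5.0           # optical computing speed of the OP: 5 GB/s
m     = 200           # computation amount C = m * D
n, c  = 6, 2          # number of SOPs and the vacation cap
delta = 0.1           # vacation rate of the SOPs

mu1  = xi / D         # Stage 1 service rate
mu2  = r / (m * D)    # service rate of the whole OP
mu2s = mu2 / n        # service rate of each SOP
lam_max = 0.04        # largest arrival rate in the experiments
assert lam_max / mu2 < 1, "Stage 2 must stay stable"
```

With these values mu1 = 0.1 and mu2 = 0.05, so the largest experimental load on Stage 2 is ρ2 = 0.8, safely below 1.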

6.2 Influence on system performance of arrival rate
The experimental results obtained by numerically simulating the proposed model with the above parameters are shown in Table 1 and Fig. 5. T1 and R1, T3 and R3 all increase linearly with λ, as can be seen in Fig. 5a, c, e, g, respectively. The reason is that they are all increasing functions of λ, as shown in formulas (5) and (13). Similarly, T2, T and V all decrease as λ increases, as shown in Fig. 5b, d, i, respectively. The main reason is that a high task arrival rate causes state changes from ''vacation'' to ''busy''; that is, the number of SOPs on vacation decreases as λ increases, as shown in Fig. 5i. In particular, V gradually falls from 2.0000 to 1.3827 as λ increases from 0.002 to 0.032, which can be interpreted as one SOP changing its state from ''vacation'' to ''busy''. The resulting increase in busy SOPs inevitably leads to a decrease in T2. On the other hand, as shown in Fig. 5a-c and Table 1, T is mainly determined by T2, so it decreases as λ increases. Most interestingly, R2, R and U all tend to increase first and then decrease with increasing λ, as shown in Fig. 5f, h, j, respectively. The reason is that a low task arrival rate does not make the states of the SOPs change from ''vacation'' to ''busy'', so R2 increases with λ. When λ reaches a certain threshold, about 0.032, one of the two SOPs on vacation changes its state from ''vacation'' to ''busy'' after its vacation ends, which makes R2 gradually decrease. Similarly, R is mainly influenced by R2, as shown in Fig. 5e-g, so it also tends to increase first and then decrease with λ. According to formula (12), U is a function of R2 when n is determined, so it, like R2, tends to increase first and then decrease. In short, T tends to decrease as λ increases, while R and U tend to increase first and then decrease.

Influence of the number of SOPs on system performance
For the parameters in Sect. 6.1, we consider the system performance indicators, including response time T, task number R and OP utilization U, when n is equal to 4, 5 and 6 and the other parameters are unchanged. The relevant results are shown in Table 2 and Fig. 6. A special explanation about OP utilization is needed: the SOPs differ in size after the whole OP is uniformly divided into 4, 5 and 6 parts, so U needs to be corrected to obtain the real OP utilization when n is equal to 4 and 5. The correction formula for the utilization is as follows.
U_r = rU, where U_r is the OP utilization after correction and r is the correction coefficient; r is equal to 1.5 and 1.2 when n is 4 and 5, respectively.
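The correction can be sketched as follows. This assumes a linear correction U_r = rU with r = 6/n, which is an assumption on our part, chosen because it is consistent with the stated coefficients r = 1.5 for n = 4 and r = 1.2 for n = 5; the paper's exact formula may differ.

```python
# Utilization correction sketch. Assumption: U_r = r * U with
# r = n_base / n, matching the stated r = 1.5 (n = 4) and 1.2 (n = 5).

def corrected_utilization(u, n, n_base=6):
    """Scale the model utilization u to the real OP utilization."""
    r = n_base / n   # correction coefficient
    return r * u

for n in (4, 5, 6):
    print(f"n={n}: r={6 / n:.2f}, corrected U = {corrected_utilization(0.4, n):.3f}")
```

For n = 6 the coefficient is 1 and no correction is applied, matching the baseline configuration.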
As can be seen from Table 2 and Fig. 6, the response times T for different numbers of SOPs show the same trend with the increase of the task arrival rate k, and the response time at each arrival rate increases with the increase of n. The task numbers and OP utilizations tend to increase first and then decrease with the increase of the arrival rate. Meanwhile, the task number at each arrival rate increases with the increase of n while the OP utilization decreases. The reason is that the processing speed of the whole OP is unchanged, so the processing speed of each SOP is larger when n is smaller. In other words, reducing n improves system performance.

Influence of the maximum number of SOPs allowed to be on asynchronous vacation on system performance
For the parameters in Sect. 6.1, we similarly consider the system performance indicators, including response time T, task number R and OP utilization U, when c is equal to 1, 2 and 3 and the other parameters are unchanged. The relevant results are shown in Table 3 and Fig. 7. As can be seen from Table 3 and Fig. 7, T, U and R show the same trend with the increase of k when c is assigned different values; that is, they all increase first and then decrease, although R and U increase more significantly. When c is equal to 1 and 2, the values of T, R and U at a given arrival rate are basically identical. The performance indicators for c equal to 3 are significantly lower than those for c equal to 1 and 2 when k > 0.028, while the response time T for c equal to 3 is significantly higher than that of the other two when k < 0.028. The reason is that a lower arrival rate does not change the states of the three SOPs on vacation, i.e. it does not make any of them run; these SOPs enter the running state one by one with the increase of k. At the same time, all of the SOPs can be maintained while they are on vacation.
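The qualitative effect of the vacation cap c can be reproduced with a simple continuous-time Markov-chain simulation. This is an illustrative sketch, not the paper's QBD solution: it assumes exponential service and vacation times, exhaustive service (a server only starts a vacation when the queue is empty), at most c servers on vacation at once, and multiple vacations (a server returning to an empty queue starts another vacation). All parameter values below are illustrative.

```python
import random

def simulate(lam, mu, delta, n, c, steps=400_000, seed=7):
    """CTMC simulation sketch of M/M/n with asynchronous multiple
    vacations: at most c idle servers may be on vacation at once.
    Returns the time-averaged number of tasks in the system."""
    rng = random.Random(seed)
    q = busy = 0
    vac = min(c, n)           # idle servers begin on vacation
    t = area = 0.0
    for _ in range(steps):
        r_arr, r_dep, r_vac = lam, busy * mu, vac * delta
        total = r_arr + r_dep + r_vac
        dt = rng.expovariate(total)
        area += (q + busy) * dt
        t += dt
        u = rng.random() * total
        if u < r_arr:                    # task arrival
            if busy + vac < n:           # an idle, non-vacationing server
                busy += 1
            else:
                q += 1
        elif u < r_arr + r_dep:          # service completion
            busy -= 1
            if q > 0:
                q -= 1
                busy += 1
            elif vac < c:                # exhaustive service -> take vacation
                vac += 1
        else:                            # a vacation ends
            if q > 0:                    # returning server picks up work
                vac -= 1
                q -= 1
                busy += 1
            # else: multiple-vacation policy, it starts another vacation
    return area / t

# With c = 0 the model reduces to a plain M/M/n queue; allowing
# vacations (c = 1) with a slow vacation rate leaves more tasks waiting.
no_vac = simulate(lam=0.5, mu=1.0, delta=0.1, n=1, c=0)
with_vac = simulate(lam=0.5, mu=1.0, delta=0.1, n=1, c=1)
print(f"mean tasks, c=0: {no_vac:.2f}   c=1: {with_vac:.2f}")
```

With c = 0 and n = 1 the estimate should be near the M/M/1 value ρ/(1 − ρ) = 1, giving a sanity check on the simulation, while allowing a vacation noticeably inflates the mean task number, mirroring the trade-off between maintenance opportunities and performance discussed above.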

Influence of the vacation rate on system performance
For the parameters in Sect. 6.1, we similarly consider the system performance indicators, including response time T, task number R and OP utilization U, when d is equal to 0.0001, 0.001, 0.01, 0.1 and 1.0 while the other parameters are unchanged. The relevant results are shown in Table 4 and Fig. 8.
As can be seen from Table 4 and Fig. 8, the vacation rate d has little or no effect on system performance when the task arrival rate k is low, for example, when k is less than 0.020; when k is high, the vacation rate has a significant effect on the response time T and the task number R. In other words, both of them at each arrival rate decrease remarkably with the decrease of d. The main reason is that a smaller vacation rate means fewer vacations per unit time. Obviously, the lower number of vacations allows newly arriving tasks to be processed promptly, improving system performance. In addition, as can be seen from Table 4 and Fig. 8c, the OP utilization U is improved when d is 0.1 and the task arrival rate is high; the utilization curves for the other values of d are essentially coincident since the utilizations at each arrival rate are identical. As a result, a small vacation rate can improve system performance.

Performance comparison under different vacation models
In this section, we compare the main system performance indicators, such as response time T, task number R and OP utilization U, under different vacation models. These models include the TSSMAV proposed in this paper, the FSSMSV proposed in Wang et al. (2020a), and the FSSMAV obtained by replacing the synchronous vacation in Wang et al. (2020a) with the asynchronous vacation of this paper. The parameters in Sect. 6.1 are all unchanged, and a parameter s, the speed of the electronic computer, is added; let s = 3 GB/s. The relevant results are shown in Table 5 and Fig. 9. It can easily be seen that under FSSMSV, T first decreases slightly and then increases with the increase of k, while R and U both increase. Meanwhile, the response times under TSSMAV and FSSMAV both decrease with the increase of k, while R and U under TSSMAV and FSSMAV both tend to increase first and then decrease, as shown in Table 5. Compared with TSSMAV, FSSMAV adds the data preprocessing stage, so R and T under the latter are slightly higher than those under the former. In particular, the increasing trend of U under FSSMSV does not mean that FSSMSV can improve system performance; on the contrary, it will degrade it, because excessively high utilization results in a higher risk of downtime, the system not being maintained effectively at high utilization. Of particular importance, each of the three indicators at each arrival rate under FSSMSV is greater than under the other two models, and each of the three indicators at each arrival rate under TSSMAV is no greater than under the other two. Therefore, TSSMAV outperforms the other two vacation models in terms of the system performance of the TOC.

Conclusions and future work
Performance analysis and evaluation of a TOC has attracted more and more attention from both its customers and its providers. To model the computing ecology of the TOC more accurately for performance analysis and evaluation, this paper built a three-staged service model of the TOC by introducing asynchronous multi-vacation queuing and tandem queuing. Here, the vacations refer to the asynchronous vacations of the SOPs: each SOP does not start a vacation until it has served exhaustively and the number of SOPs on vacation is less than the maximum allowed. Meanwhile, this paper proposed a task scheduling algorithm with partial-SOP asynchronous multi-vacations and an OP management algorithm comprising an OP allocation algorithm and an OP recovery algorithm. We focused on building analytical models for performance analysis and evaluation of the TOC after selecting some primary performance indicators such as the number of tasks, the response time and the OP utilization. We built mathematical models for the numbers of tasks and the service times in the first and third stages based on the M/M/1 queuing system. In particular, we established the analytical models for the second-stage performance indicators, namely the number of tasks, the service time and the OP utilization, based on the M/M/n queuing system with asynchronous multi-vacations, and solved them by introducing the QBD process. Thus, we obtained the number of tasks in the TOC system and the response time by summing the stage-wise quantities, respectively.
Finally, we obtained the system performance indicators by numerical simulation of these models. The results showed that the response time drops with the increase of the task arrival rate, while the number of tasks and the OP utilization increase first and then decrease. Moreover, the smaller the number of SOPs and the smaller the vacation rate, the better the system performance. In addition, the maximum number of SOPs allowed to be on vacation has an important impact on system performance: if the number of SOPs is 6, the performance is optimal when this maximum is 2 and the arrival rate is not greater than 0.028, or when this maximum is 3 and the arrival rate is greater than 0.028. At the same time, the system performance was compared under different vacation models, namely FSSMSV, TSSMAV and FSSMAV. The results showed that the performance under TSSMAV is superior to that of the other two. Therefore, compared with synchronous vacations, asynchronous vacations not only allow the system to be better maintained but also improve system performance to some extent. The next step is to study how to optimize the parameters to achieve optimal TOC performance and further improve the user experience.