A Quantum Online Portfolio Optimization Algorithm

Portfolio optimization plays a central role in finance to obtain optimal portfolio allocations that aim to achieve certain investment goals. Over the years, many works have investigated different variants of portfolio optimization. Portfolio optimization also provides a rich area to study the application of quantum computers to obtain advantages over classical computers. In this work, we give a sampling version of an existing classical online portfolio optimization algorithm by Helmbold et al., for which we in turn develop a quantum version. The quantum advantage is achieved by using techniques such as quantum state preparation, inner product estimation and multi-sampling. Our quantum algorithm provides a quadratic speedup in the time complexity in terms of $n$, where $n$ is the number of assets in the portfolio. The transaction cost of both our classical and quantum algorithms is independent of $n$, which is especially useful for practical applications with a large number of assets.


Introduction

Online optimization
Online optimization is a branch of optimization where the input data is revealed over time and decisions have to be made with incomplete knowledge of the input data. At every time step, a loss function is given based on the decisions made so far. A feature of online optimization is that the sequential input can be given in an adversarial manner; the provable guarantees hold even if the input is chosen by an adversary who knows the algorithm's strategy. Online convex optimization studies the problem of optimizing a convex function over a convex set in an online fashion. Popular first-order algorithms for online convex optimization include variants of gradient descent, mirror descent and coordinate descent [1,2,3,4,5].
Apart from the commonly known gradient descent method, the multiplicative weight update method is an alternative for solving optimization problems. The multiplicative weight update method is a primal-dual algorithm proposed by Arora and Kale [6], which assigns an initial weight to each expert and, at every iteration, updates the weights according to the experts' performances. This algorithm can also be extended to the online convex optimization framework when the convex set is the $n$-dimensional simplex. The multiplicative weight update method is one of the second-order methods in online convex optimization besides Newton's method [7], which iteratively finds the roots of a differentiable function. Some applications of the multiplicative weight update method include solving linear programs and semidefinite programs [8], learning algorithms [9], and portfolio selection [10].
In zeroth-order online convex optimization (bandit convex optimization), the feedback is in the form of a real number (instead of a loss function) and is therefore less informative. The first algorithm for bandit convex optimization was proposed by Flaxman et al. [11]. Subsequently, many follow-up works [12,13,14] have improved the regret bound.

Portfolio optimization
Portfolio optimization is a standard problem in mathematical finance. The first formalization, the Markowitz (mean-variance) model, was proposed by Nobel laureate Harry Markowitz [15]. It is a single-period unconstrained quadratic programming problem, which either maximizes the portfolio return for a given level of risk or minimizes the risk for a given return. However, there are several caveats concerning the implementation of this model. Among them, the model relies on knowledge of the mean and covariance matrix of the asset returns. In addition, the model suffers from error maximization, i.e., a small change in the inputs can result in a large change in the portfolio [16]. Consequently, many refinements have been proposed to make the model more realistic [17,18,19,20,21,22,23,24,25,26,27,28].
Reference [10] by Helmbold et al. is a seminal paper discussing a (classical) online algorithm for portfolio selection based on the multiplicative weight update rule. The update rule was derived using a framework introduced by Reference [29] for online regression. The authors adapted this framework to the online portfolio selection setting, and the resulting algorithm uses linear (in the number of assets) time and space to update the portfolio vector at each time step. A survey on (classical) online portfolio selection from an online machine learning perspective was given in Reference [30]. The survey expressed online portfolio selection as a sequential decision problem and included various classes of related algorithms, such as follow the winner, follow the loser, pattern-matching-based approaches and meta-learning algorithms.

Our work
Our main contribution is an online quantum algorithm for portfolio selection. We show that the online portfolio selection algorithm proposed by Helmbold et al. [10] can be quantized. We adopt a step-by-step approach to demonstrate how we arrive at the quantum algorithm. We start from Algorithm 1, the slightly extended version of the classical online portfolio optimization algorithm from Reference [10], which includes a transaction cost (Corollary 1). Next, we implement a sampling procedure in Algorithm 2, which renders the transaction cost independent of $n$ (Theorem 2). Subsequently, we build on Algorithm 2 but use an inner product estimation procedure to compute the portfolio vectors in Algorithm 3 (Corollary 2). Lastly, we use quantum inner product estimation and quantum multi-sampling to replace their classical counterparts and use quantum state preparation to prepare the portfolio vector when devising our quantum online portfolio optimization algorithm, Algorithm 4 (Theorem 4).
We summarize our results in the table below.

$O(Tn)$
$O(TnC)$

Table 1: Summary of results. Throughout this work, $n$ is the number of assets, $T$ is the total number of time steps, and $r_{\min}$ is the lower bound for the price relatives (see Assumption 3). In addition, $C$ is the transaction cost (see Assumption 2) and $3\delta$ is an upper bound on the probability of failure.
The regret bound achieved by Algorithm 4 is larger than that of Algorithm 1 only by a small factor, and the algorithm provides a quadratic speedup in the run time in terms of $n$, the number of assets in the portfolio. The speedup is due to the use of amplitude amplification, quantum inner product estimation, and quantum multi-sampling. In addition, the algorithm does not have to store the portfolio vectors $w^{(t)}$ explicitly for every time step $t$. Instead, the portfolio vectors can be computed efficiently via unitaries that perform arithmetic operations. Moreover, the transaction cost of our algorithm is independent of $n$, which is especially useful for practical applications with a large number of assets in the portfolio.

Related work
References [31,32,33,34] discuss the state of the art, potential, and challenges of quantum computing in finance. Rosenberg et al. [35] discuss a non-convex discrete portfolio optimization problem in the context of D-Wave's quantum annealer. Their problem formulation aims to maximize the expected return while minimizing the risk and transaction costs. They showed numerically that the quantum annealer could in principle solve this problem with high probability and that this success probability can be increased by making adjustments to the annealer. In Ref. [36], the authors proposed a quantum algorithm for the unconstrained portfolio optimization problem. The algorithm uses quantum linear system solvers [37,38] to obtain speedups for portfolio optimization problems that can be reduced to unconstrained quadratic programs, which in turn are reducible to a single linear system. Subsequently, Ref. [39] gave a quantum algorithm for the general constrained portfolio optimization problem with an arbitrary number of nonnegativity and budget constraints, resulting in a polynomial speedup in terms of the number of assets compared to the best known classical algorithm when only a moderately accurate solution is required.
In terms of practical implementation, Reference [40] evaluated the experimental performance of the Quantum Approximate Optimization Algorithm and the Quantum Alternating Operator Ansatz for solving a discrete portfolio optimization problem in a multi-period portfolio rebalancing setting. Subsequently, Ref. [41] showed numerically that the Quantum Walk Optimization Algorithm is capable of achieving a significantly better performance. In the noisy intermediate-scale quantum (NISQ) setting, Ref. [42] proposed a hybrid algorithm for end-to-end execution of small-scale portfolio optimization problems on near-term devices. Their algorithm uses techniques such as mid-circuit measurement, quantum conditional logic, and qubit reset/reuse, and also improves on the existing eigenvalue inversion component of HHL.
Online optimization has been considered in the quantum setting. Boosting is an approach to improve the accuracy of a weak learning algorithm. Quantum boosting was discussed in Reference [43] to improve the time complexity of the widely used classical AdaBoost proposed by Ref. [44]. Subsequently, a follow-up work by Ref. [45] provided a significantly faster and simpler quantum boosting algorithm. The Hedge algorithm proposed by Freund and Schapire uses the multiplicative weight update method to adaptively allocate mixed strategies to solve an adversarial online optimization problem. The Sparsitron by Reference [46], which is based on the Hedge algorithm, is a machine learning algorithm for undirected graphical models. Quantum versions of both the Hedge algorithm and the Sparsitron were discussed in Ref. [47]. In zeroth-order optimization, there are instances where quantum advantage has been proven. For example, in the multi-armed bandits setting, Ref. [48] proposed a quantum algorithm that provides an exponential speedup in terms of the time $T$ in the regret bound compared to the well-known classical lower bounds [49,50]. In bandit convex optimization, Ref. [51] gave a quantum algorithm that achieves a regret bound that is independent of the dimension $n$. This outperforms the best known classical algorithm [52].

Notations
We use $[n]$ to represent the set $\{1, \cdots, n\}$, where $n \in \mathbb{Z}_+$, and denote the $i$-th entry of a vector $v \in \mathbb{R}^n$ as $v_i$ for $i \in [n]$. If a vector has a time dependency we denote it as $v^{(t)}$. Let $e_i$ be the vector of all zeros with a 1 in the $i$-th position. The $\ell_1$-norm of a vector $v \in \mathbb{R}^n$ is defined as $\|v\|_1 := \sum_{i=1}^{n} |v_i|$. We use $\vec{0}$ to denote the all zeros vector and use $|\bar{0}\rangle$ to denote the state $|0\rangle \otimes \cdots \otimes |0\rangle$, where the number of qubits is clear from the context. The maximum entry in absolute value of a vector $v \in \mathbb{R}^n$ is denoted as $\|v\|_{\max} = \max_{i \in [n]} |v_i|$ and we denote the maximum entry of a vector $v \in \mathbb{R}^n$ as $v_{\max} = \max_{i \in [n]} v_i$. For $v \in \mathbb{R}^n$ and $k \in \mathbb{R}$, $k^v \in \mathbb{R}^n$ is the element-wise exponential of $v$, i.e., $(k^v)_i = k^{v_i}$. We write the natural logarithm (base $e$) as $\log$. We use $\tilde{O}(\cdot)$ to hide polylogarithmic factors, i.e., $\tilde{O}(f(n)) = O(f(n) \cdot \mathrm{polylog}(f(n)))$. We sometimes use $O(1)$ to denote a constant.

The computational model
We refer to the run time of a classical/quantum computation as the number of basic gates performed. We assume a classical arithmetic model, which allows us to ignore issues arising from the fixed-point representation of real numbers. The basic arithmetic operations take constant time. In the quantum setting, we assume a quantum circuit model. Each quantum gate in the circuit represents an elementary operation, and the application of each quantum gate takes constant time. The time complexity of a given unitary operator $U$ is the minimum number of basic quantum gates required to implement $U$. In addition, we assume a quantum arithmetic model, which is equivalent to the classical model in that arithmetic operations take constant time. Our quantum algorithm assumes quantum query access to certain vectors. For the oracles, the representation of real numbers to finite precision is also not taken into account. Given a vector $v \in \mathbb{R}^n$, we say we have quantum query access to this vector if we have access to the operation $O_v$ which performs $O_v |j\rangle |\bar{0}\rangle = |j\rangle |v_j\rangle$ for all $j \in [n]$. The second register is assumed to contain sufficient qubits to make all the subsequent computations accurate, in analogy to the sufficient bits that a classical algorithm assumes to run correctly.

The online portfolio optimization framework
Consider $T$ discrete time steps and $n$ assets, and the setting as in Ref. [10]. A portfolio of these $n$ assets at time $t \in [T]$ is described by a vector $w^{(t)}$ such that for each $i \in [n]$, $w_i^{(t)} \geq 0$ and $\sum_{i=1}^{n} w_i^{(t)} = 1$. Here, we make the no-short-selling assumption, see also below. Each asset has a price as a function of time and in this work we consider the time series of closing prices. The original paper [10] uses the opening prices, and we assume that the closing price at $t$ is the same as the opening price at $t+1$. Define the day-to-day return as
$$R_i^{(t)} := \frac{\text{closing price of asset } i \text{ on day } t}{\text{closing price of asset } i \text{ on day } t-1}.$$
In this work, the performance of the assets is reflected in a price relative vector $\rho^{(t)} \in \mathbb{R}^n_+$, where for all $i \in [n]$, $\rho_i^{(t)}$ is the ratio
$$\rho_i^{(t)} := \frac{R_i^{(t)}}{R_{\max}^{(t)}},$$
where $R_{\max}^{(t)} = \max_{j \in [n]} R_j^{(t)}$. By definition, $0 \leq \rho_i^{(t)} \leq 1$ for all $i \in [n]$ and $t \in [T]$. However, we assume a known lower bound $r_{\min} \in (0, 1]$ such that $0 < r_{\min} \leq \rho_i^{(t)}$ for all $i \in [n]$ and $t \in [T]$. Given $w$ and $\rho$, an investor's wealth changes by a factor of $w \cdot \rho = \sum_{i=1}^{n} w_i \rho_i$ from one trading day to the next. In the online portfolio selection setting, the learning algorithm has access to the price relative vectors $\rho^{(1)}, \cdots, \rho^{(t)}$ at the end of trading day $t$. The algorithm then selects the portfolio $w^{(t+1)}$ for the next day. At the end of each trading day $t$, $\rho^{(t)}$ is revealed and the investor's wealth changes by a factor of $w^{(t)} \cdot \rho^{(t)}$. As time progresses, $\rho^{(1)}, \cdots, \rho^{(T)}$ will be revealed and $w^{(1)}, \cdots, w^{(T)}$ will be selected. From the start of trading day 1 through the start of trading day $(T+1)$, the wealth changes by a factor of
$$S = \prod_{t=1}^{T} w^{(t)} \cdot \rho^{(t)}.$$
Similar to the analysis in Reference [10], we will deal with the normalized logarithm of $S$,
$$LS := \frac{1}{T} \sum_{t=1}^{T} \log\left(w^{(t)} \cdot \rho^{(t)}\right),$$
since wealth often grows or decays geometrically in typical markets.
Consider the "offline gain", $LS^* = \frac{1}{T} \sum_{t=1}^{T} \log\left(w^* \cdot \rho^{(t)}\right)$, which is the maximum gain in wealth achievable when choosing the same portfolio $w^* = \arg\max_{\{w : \|w\|_1 = 1\}} S(w) = \arg\max LS(w)$ for all trading days $t \in [T]$. The difference between the offline gain and the gain of some sequence of normalized $w^{(t)}$ is called regret. Formally, it is $LS^* - LS$ and can be naively bounded as $LS^* - LS \leq \log\left(\frac{1}{r_{\min}}\right)$. The bound follows from $w \cdot \rho^{(t)} \leq 1$ for every portfolio $w$, so that $LS^* \leq 0$, and from $w^{(t)} \cdot \rho^{(t)} \geq r_{\min}$, so that $LS \geq \log r_{\min}$. This bound does not decrease with $T$. A main result of the work by Reference [10] is a sequence of $w^{(t)}$ which achieves a regret bound of about $1/\sqrt{T}$. We would like to emphasize again the following assumptions.
Assumption 1. We assume that there is no short-selling throughout the trading period. Therefore, $w_i^{(t)} \geq 0$ for all $t \in [T]$ and $i \in [n]$.
To model the cost of trading, we assume a fixed transaction cost per investment.This cost will highlight the difference between the standard and the sampling algorithm.It was mentioned as a possible extension in Reference [10].
Assumption 2. We assume that a transaction cost of C ě 0 is incurred when investing in a single asset.Here, this transaction cost is independent of the amount of asset that is bought.
The following assumption is as in the classical work and simplifies the analysis of the regret bound. Reference [10] also relaxes this assumption and provides a different regret bound for the relaxed setting.
Assumption 3. We assume a known lower bound $r_{\min} \in (0, 1]$ for the price relatives, i.e., $r_{\min} \leq \rho_i^{(t)} \leq 1$ for all $i \in [n]$ and $t \in [T]$.
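The quantities of this framework can be computed directly from a price series. The following is a minimal Python sketch, not taken from Reference [10]: the closing prices are hypothetical, the helper names `price_relatives` and `normalized_log_wealth` are illustrative, and $r_{\min}$ is simply read off the data here, whereas the text assumes it is known in advance. It evaluates the normalized log-wealth of the uniform portfolio and the naive bound $\log(1/r_{\min})$.

```python
import math

def price_relatives(prev_close, close):
    """Day-to-day returns R_i, divided by the best return so that max_i rho_i = 1."""
    returns = [c / p for p, c in zip(prev_close, close)]
    r_max = max(returns)
    return [r / r_max for r in returns]

def normalized_log_wealth(portfolios, relatives):
    """LS = (1/T) * sum_t log(w^(t) . rho^(t))."""
    total = 0.0
    for w, rho in zip(portfolios, relatives):
        total += math.log(sum(wi * ri for wi, ri in zip(w, rho)))
    return total / len(relatives)

# Hypothetical closing prices for n = 3 assets on three consecutive days (T = 2).
closes = [[10.0, 20.0, 5.0], [11.0, 19.0, 5.0], [11.0, 20.9, 4.5]]
relatives = [price_relatives(closes[t], closes[t + 1]) for t in range(len(closes) - 1)]

uniform = [1.0 / 3] * 3
ls = normalized_log_wealth([uniform] * len(relatives), relatives)

# Assumption for illustration only: take r_min as the smallest observed price relative.
r_min = min(min(rho) for rho in relatives)
naive_bound = math.log(1.0 / r_min)
print(ls, naive_bound)
```

Since every $w \cdot \rho^{(t)}$ lies in $[r_{\min}, 1]$, the value `-ls` can never exceed `naive_bound`, which is the naive regret bound discussed above.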

Helmbold et al.'s algorithm
In Reference [10], the authors provide an online algorithm for portfolio optimization. Given a current portfolio $w^{(t)}$, consider the following optimization problem
$$\max_{\{w^{(t+1)} :\, \|w^{(t+1)}\|_1 = 1\}} \eta \left( \log\left(w^{(t)} \cdot \rho^{(t)}\right) + \frac{\rho^{(t)} \cdot \left(w^{(t+1)} - w^{(t)}\right)}{w^{(t)} \cdot \rho^{(t)}} \right) - d\left(w^{(t+1)}, w^{(t)}\right). \tag{10}$$
The problem is to pick a portfolio vector $w^{(t+1)}$ that maximizes the gain and, at the same time, is close to the portfolio vector picked in the previous iteration. Here, $\eta$ is the "learning rate" and $d\left(w^{(t+1)}, w^{(t)}\right)$ is a distance measure between $w^{(t+1)}$ and $w^{(t)}$. We formally define the update rule which is the solution to Eq. (10) when the relative entropy is used as the distance measure.
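Definition 1 (Exponentiated gradient $\mathrm{EG}(\eta)$ update [10]). With $\eta > 0$, $w \in \mathbb{R}^n$ and $\rho \in \mathbb{R}^n$, we define by $\mathrm{EG}(\eta, w, \rho) \in \mathbb{R}^n$ the mapping which performs the following weight update for all $i \in [n]$:
$$\mathrm{EG}(\eta, w, \rho)_i := \frac{w_i \exp\left(\eta \frac{\rho_i}{w \cdot \rho}\right)}{Z},$$
where $Z = \sum_{j=1}^{n} w_j \exp\left(\eta \frac{\rho_j}{w \cdot \rho}\right)$.
In order to solve the portfolio selection problem, Reference [10] gave Algorithm 1, which uses linear time and space (in $n$) to update $w^{(t)}$ for each $t$. We present a slightly extended version of their algorithm by including the transaction cost for investing in an asset.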

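Algorithm 1 Online Portfolio Optimization Algorithm
1: Initialize $w^{(1)} = \left(\frac{1}{n}, \cdots, \frac{1}{n}\right) \in \mathbb{R}^n$.
2: for $t = 1$ to $T$ do
3: Invest in all assets according to $w^{(t)}$, with cost $C$ for each asset.
4: Wait until end of day.
5: Receive the price relatives $\rho^{(t)}$.
6: for $i = 1$ to $n$ do
7: $w_i^{(t+1)} \leftarrow \mathrm{EG}(\eta, w^{(t)}, \rho^{(t)})_i$.
8: end for
9: end for

The following theorem by Reference [10] bounds the difference in wealth gained when using a fixed portfolio vector versus the update rule Def. (1) applied to a portfolio vector initialized to be the uniform vector $\left(\frac{1}{n}, \cdots, \frac{1}{n}\right) \in \mathbb{R}^n$.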

Theorem 1 ([10]). Let $u \in \mathbb{R}^n_+$ be a portfolio vector, and let $\rho^{(1)}, \cdots, \rho^{(T)}$ be a sequence of price relatives with $\max_{i \in [n]} \rho_i^{(t)} = 1$ and $\rho_i^{(t)} \geq r_{\min} > 0$ for all $i \in [n]$, $t \in [T]$, where it is assumed that $r_{\min}$ is known. Set $w^{(1)} = \left(\frac{1}{n}, \cdots, \frac{1}{n}\right)$ …
Thm. 1 implies the corollary below.
Corollary 1 (Guarantee and run time of Algorithm 1 [10]). Algorithm 1 with $\eta = 2 r_{\min} \sqrt{\frac{2 \log n}{T}}$ achieves
$$LS^* - LS_{EG} \leq \frac{1}{2 r_{\min}} \sqrt{\frac{2 \log n}{T}}$$
with a total run time of $O(Tn)$ and a transaction cost of $O(TnC)$.
Compared to the naive bound $LS^* - LS_{EG} \leq \log\left(\frac{1}{r_{\min}}\right)$, this bound decreases with $T$ and is better when $T \geq \frac{\log n}{2 r_{\min}^2 \log^2\left(\frac{1}{r_{\min}}\right)}$.
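The exponentiated gradient update and the cost accounting of Algorithm 1 can be illustrated as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the price relatives are hypothetical, the function names `eg_update` and `run_eg` are illustrative, and the learning rate is the one stated in Corollary 1.

```python
import math

def eg_update(eta, w, rho):
    """One EG(eta) step: w_i <- w_i * exp(eta * rho_i / (w . rho)) / Z."""
    inner = sum(wi * ri for wi, ri in zip(w, rho))
    unnormalized = [wi * math.exp(eta * ri / inner) for wi, ri in zip(w, rho)]
    z = sum(unnormalized)  # normalization factor Z
    return [u / z for u in unnormalized]

def run_eg(relatives, r_min, cost_per_asset):
    """Run the EG algorithm, returning (LS_EG, total transaction cost, final weights)."""
    T, n = len(relatives), len(relatives[0])
    eta = 2.0 * r_min * math.sqrt(2.0 * math.log(n) / T)
    w = [1.0 / n] * n  # uniform initial portfolio
    log_wealth, total_cost = 0.0, 0.0
    for rho in relatives:
        total_cost += n * cost_per_asset  # invest in all n assets
        log_wealth += math.log(sum(wi * ri for wi, ri in zip(w, rho)))
        w = eg_update(eta, w, rho)  # O(n) time and space per step
    return log_wealth / T, total_cost, w

# Hypothetical price relatives: T = 4 days, n = 3 assets, max entry 1 per day.
relatives = [
    [1.0, 0.9, 0.8],
    [0.85, 1.0, 0.9],
    [1.0, 0.95, 0.8],
    [0.9, 0.8, 1.0],
]
ls_eg, cost, w_final = run_eg(relatives, r_min=0.8, cost_per_asset=1.0)
print(ls_eg, cost, w_final)
```

Each step touches all $n$ entries of the portfolio vector, which is where the $O(Tn)$ run time and the $O(TnC)$ transaction cost come from.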

Sampling-based online portfolio optimization algorithm
We now consider including a sampling procedure, which leads to a reduction in the total transaction cost as we only invest in the sampled assets.
Fact 1 ($\ell_1$-sampling [53,54]). Given a probability vector $p \in [0, 1]^n$, there exists a data structure that samples the index $i \in [n]$ with probability $p_i$ and that can be constructed in $O(n)$ time. The time required for obtaining one sample is $O(1)$.
The assumptions of Ref. [53] allow us to omit $\log(n)$ factors in the time for construction and sampling, and we adopt the same assumption in this work. Based on this data structure, we construct Algorithm 2. This algorithm samples multiple assets from the portfolio vector and invests only in those assets. For the portfolio update, however, the complete vector of price relatives is used and the complete new portfolio vector is computed. The following theorem gives an upper bound on the regret of the logarithmic wealth obtained from sampling from the exponentiated gradient update.
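with success probability at least $1 - 2\delta$. The total run time is …

Proof. Let $s \in \mathbb{Z}_+$. Sample $i_\ell \in [n]$ with probability $w_{i_\ell}^{(t)}$ for all $\ell \in [s]$. Define the random variable $Z^{(t)} = \frac{1}{s} \sum_{\ell=1}^{s} \rho_{i_\ell}^{(t)}$. Then its expectation is $\mathbb{E}\left[Z^{(t)}\right] = \sum_{i=1}^{n} w_i^{(t)} \rho_i^{(t)} = w^{(t)} \cdot \rho^{(t)}$.

3: Prepare sampling data structure for $w^{(t)}$ using Fact 1.
5: Invest the amount $1/s$ in each asset $i_1^{(t)}, \cdots, i_s^{(t)}$ at cost $C$ each.
6: Wait until end of day.
8: for $i = 1$ to $n$ do

Using Hoeffding's inequality (see Fact 3) with $q > 0$, we obtain
$$\Pr\left[\left|Z^{(t)} - w^{(t)} \cdot \rho^{(t)}\right| \geq q\right] \leq 2 \exp\left(-\frac{2 s q^2}{(1 - r_{\min})^2}\right) \leq \frac{2\delta}{T}$$
when we set $q = \sqrt{\frac{1}{2T}}$ and $s = 2T(1 - r_{\min})^2 \log \frac{T}{\delta}$. For the success probability, we hence obtain $1 - \frac{2\delta}{T}$. Now we bound
$$LS_{EG} - LS_{samp} \leq \frac{1}{T} \sum_{t=1}^{T} \left| \log\left(w^{(t)} \cdot \rho^{(t)}\right) - \log Z^{(t)} \right| \leq \frac{1}{T r_{\min}} \sum_{t=1}^{T} \left| w^{(t)} \cdot \rho^{(t)} - Z^{(t)} \right| \leq \frac{1}{r_{\min}} \sqrt{\frac{1}{2T}},$$
where the second inequality holds by Lipschitz continuity and the last by Eq. (16), with probability $1 - 2\delta$ by the union bound. Therefore, the regret is bounded by
$$LS^* - LS_{samp} \leq \frac{1}{2 r_{\min}} \sqrt{\frac{2 \log n}{T}} + \frac{1}{r_{\min}} \sqrt{\frac{1}{2T}},$$
which holds with probability at least $1 - 2\delta$.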
The performance of the algorithm worsens slightly in three regards. Firstly, the regret bound worsens by a constant factor. Secondly, the algorithm is probabilistic and we obtain a $\log\left(\frac{1}{\delta}\right)$ dependence in the run time, where $2\delta$ is the failure probability of the algorithm. Usually, this failure probability can be taken as some small constant such as 0.001 to obtain a 99.9% confidence that the algorithm ran correctly. Thirdly, we obtain a $T^2$ dependence in the run time due to the multi-sampling step. The benefit of the algorithm is that the transaction cost is reduced from $TnC$ to $O\left(T^2 C \log\left(\frac{T}{\delta}\right)\right)$.
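The sampling step and its effect on the transaction cost can be sketched in Python as follows. This is an illustration under stated assumptions rather than the algorithm as implemented by the authors: Python's `random.choices` stands in for the sampling data structure of Fact 1, the price relatives are hypothetical, and the names `sample_size` and `run_sampled_eg` are illustrative. The sample size follows the choice $s = 2T(1 - r_{\min})^2 \log(T/\delta)$ from the proof, rounded up.

```python
import math
import random

def sample_size(T, r_min, delta):
    """s = 2 T (1 - r_min)^2 log(T / delta), rounded up to at least one sample."""
    return max(1, math.ceil(2 * T * (1 - r_min) ** 2 * math.log(T / delta)))

def run_sampled_eg(relatives, r_min, delta, cost_per_asset, rng):
    """Invest only in s sampled assets per day; update the full portfolio with EG."""
    T, n = len(relatives), len(relatives[0])
    eta = 2.0 * r_min * math.sqrt(2.0 * math.log(n) / T)
    s = sample_size(T, r_min, delta)
    w = [1.0 / n] * n
    log_wealth, total_cost = 0.0, 0.0
    for rho in relatives:
        # Stand-in for Fact 1: draw s indices i with probability w_i.
        picks = rng.choices(range(n), weights=w, k=s)
        total_cost += s * cost_per_asset       # cost no longer depends on n
        z_t = sum(rho[i] for i in picks) / s   # realized wealth factor Z^(t)
        log_wealth += math.log(z_t)
        # Full EG update using the complete vector of price relatives.
        inner = sum(wi * ri for wi, ri in zip(w, rho))
        unnorm = [wi * math.exp(eta * ri / inner) for wi, ri in zip(w, rho)]
        total = sum(unnorm)
        w = [u / total for u in unnorm]
    return log_wealth / T, total_cost, s

# Hypothetical price relatives: T = 4 days, n = 3 assets.
relatives = [
    [1.0, 0.9, 0.8],
    [0.85, 1.0, 0.9],
    [1.0, 0.95, 0.8],
    [0.9, 0.8, 1.0],
]
ls_samp, cost, s = run_sampled_eg(relatives, 0.8, 0.05, 1.0, random.Random(0))
print(ls_samp, cost, s)
```

The total cost is $T \cdot s \cdot C$, so with $s$ chosen as above it scales as $T^2 C \log(T/\delta)$ regardless of how many assets are available.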

Convergence theorem for erroneous updates
Before we move to an approximate classical algorithm and our quantum algorithm, we generalize the convergence result from the original work.The generalizations are in terms of the availability of the inner product and the normalization factor, both of which will be known only approximately in the quantum algorithm.The generalization is embodied in the following definition of an erroneous update rule.
The main theorem for this update rule is as follows, for which we modify the proof of Theorem 1 from Ref. [10].
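Theorem 3 (Main convergence theorem for erroneous updates). Let $u \in \mathbb{R}^n_+$ be a portfolio vector, and let $\rho^{(1)}, \cdots, \rho^{(T)}$ be price relatives with $\max_{i \in [n]} \rho_i^{(t)} = 1$ and $\rho_i^{(t)} \geq r_{\min} > 0$ for all $i \in [n]$, $t \in [T]$. With $w^{(1)} = \left(\frac{1}{n}, \cdots, \frac{1}{n}\right)$ … when $\epsilon_I = \frac{3\eta}{4 r_{\min}}$ and $\epsilon_Z = 0$, and … when $\epsilon_I = \frac{3\eta}{4 r_{\min}}$ and $\epsilon_Z = \frac{\eta^2}{r_{\min}^2}$.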

Classically-sampled inner product
We show an algorithm where we classically sample the inner product. This algorithm does not offer any reduction in the run time, and the transaction cost is the same as in Algorithm 2. We present the algorithm to provide a gradual transition to the quantum algorithm, as the correctness analysis will be similar for the quantum algorithm. First, we restate a lemma on classically sampled inner products. We would like to highlight that for all $t \in [T]$, the strategies $w^{(t)}$ are the same for Algorithm 1 and Algorithm 2 but, due to this erroneous update, are different for the following Algorithm 3 and Algorithm 4. The following lemma estimates inner products with relative error. As in our portfolio setting, the vector $x$ has a lower bound for its entries.
Lemma 1 (Inner product estimation). Let $\delta \in (0, 1)$. Given query access to $x \in [x_{\min}, 1]^n$ and $\ell_1$-sampling access to a probability vector $p \in [0, 1]^n$, we can determine, with success probability at least $1 - \delta$, the inner product $\alpha := p \cdot x \in [x_{\min}, 1]$ to multiplicative error $\epsilon_I \leq x_{\min}$, with $O\left(\frac{1}{\epsilon_I^2 x_{\min}} \log \frac{1}{\delta}\right)$ queries and samples, and …
Proof. Consider the additive version of this lemma as given in References [55,47,56,57], adapted to the $\ell_1$ case: Let $X$ be a random variable with outcome $x_j$ with probability $p_j$. Note that $\mathbb{E}[X] = \sum_j p_j x_j = \alpha$ and $\mathrm{Var}(X) \leq \sum_j x_j^2 p_j \leq \alpha$. Apply the median-of-means method [56] on $\frac{27}{\epsilon^2} \log \frac{1}{\delta}$ samples of $X$ to be within $\epsilon \sqrt{\mathrm{Var}(X)} \leq \epsilon \sqrt{\alpha}$ of $p \cdot x$ with probability at least $1 - \frac{\delta}{2}$ using $O\left(\frac{1}{\epsilon^2} \log \frac{1}{\delta}\right)$ queries. For the multiplicative estimation, run the above algorithm with the precision parameter set to $\epsilon = \epsilon_I$. We obtain an estimate $\tilde{\alpha}_1$ of the inner product $\alpha = p \cdot x$ such that $|\alpha - \tilde{\alpha}_1| \leq \epsilon_I \sqrt{\alpha}$. Then, re-run the algorithm with precision $\epsilon = \epsilon_I \sqrt{\tilde{\alpha}_1}/2$. We will in turn obtain an estimate $\tilde{\alpha}_2$ such that … This costs $O\left(\frac{1}{\epsilon_I^2 x_{\min}} \log \frac{1}{\delta}\right)$ queries to obtain the desired guarantee.
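The additive estimation step used in this proof can be sketched with a median-of-means estimator. The following Python sketch is an illustration only: `random.choices` stands in for $\ell_1$-sampling access to $p$, the vectors and group sizes are hypothetical and not the constants of the lemma, and the two-stage refinement to multiplicative error is not implemented.

```python
import random
import statistics

def estimate_inner_product(p, x, samples_per_group, groups, rng):
    """Median-of-means estimate of alpha = p . x from l1-samples of p.

    Each sample draws an index j with probability p_j and queries x_j, so a
    single sample is a random variable X with E[X] = p . x.
    """
    n = len(p)
    means = []
    for _ in range(groups):
        idx = rng.choices(range(n), weights=p, k=samples_per_group)
        means.append(sum(x[j] for j in idx) / samples_per_group)
    # The median over independent group means suppresses the failure probability.
    return statistics.median(means)

# Hypothetical inputs: x has entries in [x_min, 1] with x_min = 0.5.
p = [0.1, 0.2, 0.3, 0.4]
x = [0.5, 0.75, 1.0, 0.75]
alpha = sum(pj * xj for pj, xj in zip(p, x))  # exact inner product, 0.8

estimate = estimate_inner_product(p, x, samples_per_group=2000, groups=9,
                                  rng=random.Random(1))
relative_error = abs(estimate - alpha) / alpha
print(alpha, estimate, relative_error)
```

The number of samples depends only on the target precision and the failure probability, not on the dimension $n$, which is the property the subsequent algorithms rely on.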
with success probability at least …
Proof. For the guarantee, we use Thm. 2 and Thm. 3, and omit more detailed steps here. Aside from the inner product sampling, the run time and transaction cost are the same as in Thm. 2.
A single inner product sampling to accuracy $\epsilon_I = \frac{3}{2} \sqrt{\frac{2 \log n}{T}}$ takes time $O\left(\frac{T}{r_{\min}} \log \frac{1}{\delta}\right)$, and is performed $T$ times in the algorithm. A union bound over all steps in the algorithm succeeding, together with the Hoeffding bound, leads to the stated total success probability.
1: Initialize $w^{(1)} = \left(\frac{1}{n}, \cdots, \frac{1}{n}\right) \in \mathbb{R}^n$.
2: for $t = 1$ to $T$ do
3: Prepare sampling data structure for $w^{(t)}$ using Fact 1.
5: Invest the amount $1/s$ in each asset $i_1^{(t)}, \cdots, i_s^{(t)}$ at cost $C$ each.
6: Wait until end of day.
11: end for
12: end for
Output:

The quantum online portfolio optimization algorithm
We now present our quantum online portfolio optimization algorithm and its analysis. We change the input assumption to a natural quantum extension of the classical input. The correctness guarantee essentially follows from Theorem 3. We obtain a quadratic speedup in the run time compared to the classical algorithm. Our quantum online portfolio optimization algorithm makes use of the following procedures: quantum state preparation, norm estimation, and inner product estimation. We also employ a multi-sampling algorithm [58] as a subroutine, which allows us to sample $s$ elements from a collection of $n$ elements in about $\sqrt{sn}$ time instead of about $s\sqrt{n}$ time. Before we present the main algorithm, we introduce these quantum subroutines.
In the quantum setting, instead of classical access to the price relatives we assume quantum access to the price relatives.The online nature of the problem is given by the fact that we obtain these oracles at the different times.
Data Input 1 (Online gain oracles). Let $\rho^{(1)}, \cdots, \rho^{(T)}$ be price relatives with $\max_{i \in [n]} \rho_i^{(t)} = 1$ and $\rho_i^{(t)} \geq r_{\min} > 0$ for all $i \in [n]$, $t \in [T]$. Define the unitary $U_{\rho^{(t)}}$ operating on $O(\log n)$ quantum bits such that for all $j \in [n]$, $U_{\rho^{(t)}} |j\rangle |\bar{0}\rangle = |j\rangle \left|\rho_j^{(t)}\right\rangle$. At time $t \in [T]$ (end of day), assume access to unitaries $U_{\rho^{(1)}}, \cdots, U_{\rho^{(t)}}$.
The update rule Eq. (1) can be rewritten in terms of a sum over all previous price relatives and inner products, as the following observation shows.
Fact 2. Let $w^{(1)} = \left(\frac{1}{n}, \cdots, \frac{1}{n}\right)$. Given the update rule Eq. (1), we can express, for $t \in [T]$, $i \in [n]$, …
Given Data Input 1, the following computations can be performed in superposition of the index $i \in [n]$ for the assets. Similar unitaries were studied in, e.g., References [59,60,61,62].
Lemma 2. Let $t \leq T$. Let there be given the set of unitaries $U_{\rho^{(t')}}$ for $t' \in [t-1]$ as in Data Input 1, a vector $\tilde{I} \in \mathbb{R}^{t-1}$, and some reals $\eta, a \in \mathbb{R}$. There exist unitary operators performing the following computations: … where $q_i^{(t)} = \exp\left(\eta \sum_{t'=1}^{t-1} \frac{\rho_i^{(t')}}{\tilde{I}^{(t')}}\right)$ to sufficient numerical precision. These computations take $O(T)$ queries to the data input and require $O(T + \log n)$ qubits and quantum gates.
Proof. With a computational register involving $O(T)$ ancilla qubits for the gains and the ratios, perform … to sufficient accuracy using the oracles and quantum circuits for basic arithmetic operations.
Uncomputing the intermediate registers with additional queries gives us the desired result. In addition, computations of $|i\rangle |q$ …
We restate the quantum state preparation, norm estimation and inner product estimation procedures from References [63,60,64,58,47] for the convenience of the reader.
Lemma 3 (Quantum state preparation and norm estimation). Let a vector $v \in [0, 1]^n$ with $\max_j v_j = 1$ and quantum access to $v$ be given. Then: (i) Let $\epsilon > 0$ and $\delta \in (0, 1)$. There exists a quantum algorithm that outputs an estimate $z$ of the $\ell_1$-norm … $|j\rangle$ can be prepared with probability $1 - \delta$ using $O\left(\sqrt{n} \log \frac{1}{\delta}\right)$ calls to the unitary of (i) and $\tilde{O}\left(\sqrt{n} \log\left(\frac{1}{\delta}\right)\right)$ gates. The approximation in $\ell_1$-norm of the probabilities is $\left\| \tilde{p} - \frac{v}{\|v\|_1} \right\|_1 \leq 2\zeta$.
Proof. There exists a unitary operator that prepares the state $\frac{1}{\sqrt{n}} \sum_{j=1}^{n} |j\rangle \left( \sqrt{v_j} |0\rangle + \sqrt{1 - v_j} |1\rangle \right)$ with two queries to the vector, a controlled rotation, and $O(\log n)$ gates [58]. Using this unitary, the proof of part (i) is as in References [63,60,47]. For part (ii), note that for $i \in [n]$, we have that …
(iv) $V = \left\| p_{[n] \setminus W} \right\|_\infty$ with probability $1 - \delta$ in $O\left(\left(\sqrt{sn} + \sqrt{sn}/\epsilon\right) \log(1/\delta)\right)$.
The following quantum multi-sampling algorithm allows us to achieve a quadratic speedup in the sampling run time by using amplitude amplification.
Lemma 6 (Quantum multi-sampling algorithm [58]). Let $1 < s < n$ be an integer, $0 < \delta < 1$ be a real number and $p \in \mathbb{R}^n$ be a non-zero vector. Given $\Gamma > 0$, a set $W \subseteq [n]$ such that $\|p_W\|_1 \leq \Gamma$, a value $V = \left\| p_{[n] \setminus W} \right\|_\infty$ and query access to $p$, there exists a quantum algorithm that outputs $s$ independent samples from $p$ in expected time $O\left(\sqrt{sn} \log\left(\frac{1}{\delta}\right)\right)$ with probability $1 - \delta$.
We now present our quantum online portfolio optimization Algorithm 4 and its analysis.The correctness guarantee uses Theorem 3. Using Lemma 4 and the other quantum subroutines we obtain a quadratic speedup in the run time compared to Alg. 3. The following theorem gives our main result for the regret and the run time of Algorithm 4.
2: for $t = 1$ to $T$ do
3: $q_{\max} \leftarrow$ Find the largest element of $q^{(t)}$ using $U_{q^{(t)}}$ and quantum maximum finding [65] with success probability $1 - \frac{\delta}{4T}$.
8: Invest the amount $1/s$ in each asset $i_1^{(t)}, \cdots, i_s^{(t)}$ at cost $C$ each.
9: Wait until end of day.
with success probability at least $1 - 3\delta$.
Similar to the analysis of Thm. 2, we obtain … with probability at least $1 - 2\delta$. Using Theorem 3 and Eq. (63), the regret is bounded by … with success probability at least $1 - 2\delta$. By the union bound, the total success probability is at least $1 - 3\delta$ when the success probability of the algorithm is taken into account. For the run time, consider that $U_{q^{(t)}}$ costs $O(T + \log n)$ by Lemma 2. Using the values for $\eta$, $\epsilon_I$ and $\epsilon_Z$, the total run time is $O\left(T\,(\text{quantum maximum finding} + \text{quantum norm estimation} + \text{quantum state preparation} + \text{quantum inner product estimation} + \text{quantum multi-sampling})\right)$ …

Discussion and conclusion
The online setting is more general than the offline setting in that it allows for the input to be given sequentially, where the sequence could be chosen adversarially.The adversarial property implies that the inputs could be selected with the knowledge of the present state of the algorithm, say, with the knowledge of our portfolio vector.The regret bounds hold nevertheless, also in the quantum setting.This online setting is rather natural for certain portfolio optimization situations, where the investment strategy can be inferred by other market participants from transaction data.Online algorithms in the portfolio optimization context have been studied in practice in References [10,66,67,68].
We have devised a quantum online portfolio optimization algorithm that runs in time $\tilde{O}\left(\frac{T^3 \sqrt{n}}{r_{\min}} \log\left(\frac{1}{\delta}\right)\right)$ and has a transaction cost that is independent of the number of assets. Our quantum algorithm achieves a slightly worse (by a constant factor) regret bound, but is more space efficient than its classical counterpart [10], not considering the space requirement for the input oracles. The classical online portfolio optimization algorithm by Reference [10] uses linear (in terms of the number of assets) time and space to update the portfolio vector in every iteration. In our quantum algorithm, we leverage the fact that the portfolio vectors can be computed efficiently via unitaries that perform arithmetic operations to save on the space/memory of the algorithm. Nevertheless, the practical implementation of the price relative oracles appears to be a bottleneck for this algorithm. In particular, building a QRAM for each of the oracles requires $O(Tn \log n)$ time and $O(Tn)$ space.
We note that in both the classical and quantum settings, we know the identity of the assets that we are investing in after we have sampled the corresponding indices. In the quantum setting, we do not perform full tomography of the portfolio vector and hence do not incur the corresponding cost. We provide a comment on the online setting in contrast to the standard Markowitz mean-variance portfolio optimization. The online setting takes into account variance and covariance of the asset prices implicitly via the time series of price relatives. The algorithms are favourable when the asset prices have bounded relative volatility [69], because they assume knowledge of the upper and lower bounds on the price relatives. Since $r_{\min} \leq \rho_j^{(t)} \leq 1$, the variance of each entry of the price relatives and the covariance between entries are upper bounded by $\frac{(1 - r_{\min})^2}{4}$ by Fact 4, and hence the volatility (standard deviation) is at most $\frac{1 - r_{\min}}{2}$. Thus, the maximum volatility of the market is taken into account by the bounds on the price relatives.
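The variance bound used above is the standard bound for a random variable confined to an interval (Popoviciu's inequality), applied to $[r_{\min}, 1]$. The following Python sketch checks it numerically; the value of $r_{\min}$ and the sampled data are hypothetical, and `variance_bound` is an illustrative helper name.

```python
import random
import statistics

def variance_bound(r_min):
    """Upper bound (1 - r_min)^2 / 4 on the variance of a value in [r_min, 1]."""
    return (1.0 - r_min) ** 2 / 4.0

r_min = 0.8  # hypothetical lower bound on the price relatives

# Equal mass on the two endpoints attains the bound exactly.
extreme = [r_min, 1.0]
extreme_var = statistics.pvariance(extreme)

# Any other data confined to [r_min, 1] stays below it.
rng = random.Random(0)
sample = [rng.uniform(r_min, 1.0) for _ in range(1000)]
sample_var = statistics.pvariance(sample)

print(variance_bound(r_min), extreme_var, sample_var)
```

The corresponding bound on the standard deviation is the square root, $(1 - r_{\min})/2$.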
In our setting, the transaction cost was taken to be a constant for each investment, independent of the size of the investment. This models the fact that for each investment some fixed amount of work has to be performed, e.g., the communication of the trade between counterparties and the transfer of the asset. This type of transaction cost serves to illustrate the benefits of the sampling algorithm over the standard algorithm. For future work, one can consider imposing additional constraints on the portfolio optimization problem. For instance, a common approach is to minimize transaction cost by including a term $\left\| w^{(t)} - w^{(t-1)} \right\|_1$ in the portfolio optimization problem, or to consider portfolio optimization in the robust setting, where the parameters belong to an uncertainty set. Various flavours of robustness such as constraint, objective and relative robustness in conjunction with different types of uncertainty sets [70] are also worth investigating.
We use the shorthand notation $\mathrm{EG}(\eta)$ if the other inputs are clear from the context.

Theorem 2. Let $\delta \in (0, 1/2)$ and $LS_{EG}$ as in Algorithm 1. Algorithm 2 outputs $LS_{samp}$ with
$$LS_{EG} - LS_{samp} \leq \frac{1}{r_{\min}} \sqrt{\frac{1}{2T}}$$
with success probability at least $1 - 2\delta$. With $\eta = 2 r_{\min} \sqrt{\cdots}$ …

F
can be achieved using the quantum circuits for basic arithmetic operations.

Algorithm 2 Sampling-based Online Portfolio Optimization Algorithm
Input: $n$, $s$, $\eta$, $T$, $C$.
1: Initialize $w^{(1)} = \left(\frac{1}{n}, \cdots, \frac{1}{n}\right) \in \mathbb{R}^n$.
2: for $t = 1$ to $T$ do

Let $\zeta \in (0, 1/2]$ and $z > 0$ be given such that $\left| z - \|v\|_1 \right| \leq \zeta \|v\|_1$. Let $\delta \in (0, 1)$. An …

The total run time is $O\left(T^3 \cdots\right)$ …

Condition the following argument on all probabilistic steps of the algorithm succeeding, which occurs with probability $1 - \delta$ from the union bound. At each time step $t \in [T]$, the quantum algorithm produces a portfolio vector $w^{(t)}$. Similar to the proof of Thm. 2, we define the random variable $Y^{(t)}$ …