The constant solution method for solving large-scale differential Sylvester matrix equations with time-invariant coefficients

This paper focuses on the solution of Sylvester matrix differential equations with time-independent coefficients. We propose a new approach based on the construction of a particular constant solution, which allows us to build an approximate solution of the differential equation from that of the corresponding algebraic equation. Moreover, when the matrix coefficients of the differential equation are large, we combine the constant solution approach with Krylov subspace methods for obtaining an approximate solution of the algebraic Sylvester equation, and thus form an approximate solution of the large-scale Sylvester matrix differential equation. We establish some theoretical results, including error estimates and convergence, as well as relations between the residuals of the differential equation and of its corresponding algebraic Sylvester matrix equation. We also give explicit benchmark formulas for the solution of the differential equation. To illustrate the efficiency of the proposed approach, we perform numerous numerical tests and make various comparisons with other methods for solving Sylvester matrix differential equations.


Introduction
Differential Lyapunov and Sylvester equations are involved in many areas of applied mathematics and arise in numerous scientific applications. For instance, they play a crucial role in control theory, model order reduction and image processing, and the list is not exhaustive. In particular, the differential Lyapunov matrix equation is a useful tool for stability analysis and control design for linear time-dependent systems [2,3]. In this paper, we are concerned with numerically solving the differential Sylvester matrix equation of the form

X'(t) = A X(t) + X(t) B - C,  X(t_0) = X_0,  t ∈ [t_0, T],  (1)

where [t_0, T] ⊂ R is a closed and bounded time interval, with t_0 and T the initial and final times, respectively. We set ∆_T = T - t_0, the length of the interval [t_0, T]. The coefficient matrices A ∈ R^{n×n}, B ∈ R^{s×s} and C ∈ R^{n×s} are constant real matrices. The differential Lyapunov matrix equation corresponds to the symmetric case where B = A^T. Before describing the new proposed method, we refer to the algebraic equation canonically associated to (1), namely

A X + X B = C,  (2)

as the corresponding (or associated) algebraic Sylvester equation. To the best of our knowledge, despite the importance of differential matrix equations, few works have been devoted to their numerical resolution when the matrix coefficients are large. Adaptations of BDF and/or Rosenbrock methods have been described in [7,8] (see also the references in [5]). However, these adaptations usually suffer from a problem of numerical data storage.
To remedy this problem, combining Krylov subspace techniques with BDF methods or with Taylor series expansions has recently been proposed [5,19]. Other existing methods described in the recent literature for solving large-scale differential Sylvester matrix equations rely on using the integral formula or some numerical ODE solver [20,37]. The strategy we pursue in this manuscript is different, in the sense that our approach for solving differential Sylvester (or Lyapunov) matrix equations is based on the use of the constant solution to the differential equation. The first result we use indicates that the solution of the differential equation can be written in terms of the solution of the corresponding algebraic equation. Additionally, in the case where the coefficient matrices A and/or B are large, we combine the new expression of the solution with projection techniques on Krylov subspaces, such as the block Arnoldi algorithm, for solving the corresponding algebraic equations or for approximating the exponential of a matrix. We momentarily assume that the matrix C in (1) represents a continuous matrix function C(t) defined on [t_0, T], and we consider the following classical differential linear system

x'(t) = A x(t) - c(t),  x(t_0) = x_0,  (3)

where A ∈ R^{p×p} is time independent and x(t), c(t) ∈ R^p for all t ∈ [t_0, T]. We assume that c is a continuous function on the interval [t_0, T]. In this case, the system (3) has a unique solution x(t) which is differentiable with a continuous derivative on [t_0, T]. It is well known that this unique solution may be written in the form

x(t) = e^{(t-t_0) A} x_0 - ∫_{t_0}^{t} e^{(t-u) A} c(u) du.  (4)

In many practical situations, the matrix A may be very large. In this case, exploiting expression (4) for computing the solution x(t) is very expensive. However, when the matrix A is large and sparse with a specific structure, some numerical techniques may be used to approximate the matrix exponential times a vector [24,32,35]. As our interest is to obtain a good approximate solution to the differential Sylvester
equation (1), we need to consider the matrix A in system (3), of size p = n s, structured as follows:

A = I_s ⊗ A + B^T ⊗ I_n,

where the matrices A, B on the right-hand side are those given in (1). The identity matrices I_n and I_s are of size n and s, respectively. The notations B^T and ⊗ stand for the transpose of B and the Kronecker product of two matrices, respectively. We recall that the Kronecker product of two matrices J and K, of sizes n_j × m_j and n_k × m_k respectively, is the matrix J ⊗ K = [J_{i,j} K] of size n_j n_k × m_j m_k. The following well-known properties will be used throughout this paper:

1. (J ⊗ K) (L ⊗ M) = (J L) ⊗ (K M), whenever the products are defined;
2. vec(A B C) = (C^T ⊗ A) vec(B).
The vec operator transforms a matrix into a vector by stacking its columns one by one to form a single column vector. Let x_0, x, c be the vectors such that x_0 = vec(X_0), x = vec(X), c = vec(C), where X_0, X(t) and C appear in (1). In [1], it was stated, without any specific details of proof, that the solution of the differential Sylvester matrix equation (1) is given by the following integral formula

X(t) = e^{(t-t_0) A} X_0 e^{(t-t_0) B} - ∫_{t_0}^{t} e^{(t-u) A} C(u) e^{(t-u) B} du.  (5)

Although the proof of this result does not present any major difficulty, it seemed interesting to us to give the details of such a proof. Indeed, the additive commutativity of the matrix exponential states that e^{M+N} = e^{M} e^{N} whenever M N = N M. Since the matrices I_s ⊗ A and B^T ⊗ I_n commute, using the properties of the Kronecker product it follows that

e^{(t-t_0)(I_s ⊗ A + B^T ⊗ I_n)} = e^{(t-t_0)(B^T ⊗ I_n)} e^{(t-t_0)(I_s ⊗ A)} = e^{(t-t_0) B^T} ⊗ e^{(t-t_0) A}.

Thus, denoting by A = I_s ⊗ A + B^T ⊗ I_n the matrix of system (3), we have

e^{(t-t_0) A} x_0 = (e^{(t-t_0) B^T} ⊗ e^{(t-t_0) A}) vec(X_0) = vec(e^{(t-t_0) A} X_0 e^{(t-t_0) B}).

Finally, this implies that the formula (4) giving the solution to the system (3) leads to the formula (5) giving the solution to the differential Sylvester equation (1). From now on, we return to the case where the vector coefficient c in (3) and the matrix coefficient C in (1) are assumed to be constant functions on the interval [t_0, T]. Thus, as the systems (1) and (3) are mathematically equivalent, for moderate-size problems it is possible to apply directly a numerical integration to the system (3) or to use (4) or the above integral formula (5). But, for a large system, we will see that it is more interesting to solve the system in the matrix form (1).
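The vec-Kronecker identity above can be verified numerically; the following is an illustrative sketch (the sizes and data are arbitrary), using the column-stacking convention for vec:

```python
import numpy as np

# Check the Kronecker identity vec(A B C) = (C^T kron A) vec(B) on random data.
rng = np.random.default_rng(0)
A = rng.standard_normal((3, 4))
B = rng.standard_normal((4, 5))
C = rng.standard_normal((5, 2))

def vec(M):
    # stack the columns of M into a single column vector
    return M.flatten(order="F")

lhs = vec(A @ B @ C)
rhs = np.kron(C.T, A) @ vec(B)
vec_err = np.linalg.norm(lhs - rhs)
```

Note that NumPy's `np.kron` follows the same block convention [J_{i,j} K] used here, so the identity holds exactly up to roundoff.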
Note that it is not restrictive to choose X_0 = 0 in the initial condition X(t_0) = X_0 of (1). Indeed, let X(t) denote the exact solution of the system (1); then the matrix function given by Y(t) = X(t) - X_0 is the unique solution of the following system

Y'(t) = A Y(t) + Y(t) B - (C - A X_0 - X_0 B),  Y(t_0) = 0.

We may first solve the previous differential equation to get Y(t) and then deduce the solution X(t) = Y(t) + X_0 of the differential equation (1).
To end this section, we give some useful notations. The Frobenius inner product is defined by ⟨Y, Z⟩ = tr(Y^T Z), where tr(M) denotes the trace of a matrix M. The associated Frobenius norm is denoted by ∥Y∥ = ⟨Y, Y⟩^{1/2}. For a bounded matrix-valued function G defined on the interval [t_0, T], we consider the uniform convergence norm ∥G∥_∞ = sup_{t ∈ [t_0, T]} ∥G(t)∥. The outline of this paper is as follows. In Section 2, we introduce our proposed method, which will be called the Constant Solution Method (CSM in short), describe some of its properties and give some theoretical results. A summarized and brief description of the corresponding algorithm will also be given. In Section 3, we combine CSM with the block Arnoldi algorithm for solving algebraic Sylvester matrix equations in order to tackle large-scale differential Sylvester equations. The two cases, full rank and low rank, are discussed. Moreover, we establish theoretical results expressing the residual of the differential equation in terms of that of the algebraic equation. We also establish some theoretical results on the convergence and on the error estimates provided by the constant solution method. In Section 4, which is devoted to numerical experiments, we first show how to generate a benchmark differential Sylvester matrix equation with a known exact solution. To the best of our knowledge, this construction is new and has never been proposed before. The numerical results section continues with several sets of experiments whose results indicate that CSM is an efficient and robust method. As usual, the last section is devoted to a brief conclusion.
2 The constant solution method for the differential Sylvester matrix equation

In the integral formula (5), quadrature methods are needed to compute the approximate solution numerically. Thus, when one (or both) of the matrix coefficients A or B is (or are) large and has (or have) no particular exploitable structure, the computation of the integral may be expensive or even unfeasible. In this section, we use another expression for the solution of the system (1), which is given in terms of the solution of the corresponding algebraic Sylvester matrix equation (2). This expression avoids the use of quadrature methods since it does not contain an integral. To the best of our knowledge, the approach we describe in this section has never been exploited in the context of solving large-scale differential Sylvester matrix equations. However, it is based on the classical and simple technique of adding a particular constant solution to the general solution of the homogeneous differential equation to form the general solution of a linear differential equation of order one with constant coefficients. Next, we give the following theorem, which provides a useful and interesting expression of the unique solution of the system (1). The result of this theorem is known in the literature [6,15] but, in practice, it has not been exploited numerically to give approximate solutions. This theorem is not difficult to establish; however, in order to facilitate the reading of the present work, it seems interesting to us to give its proof.
Theorem 1 Suppose that the matrices A and B in the system (1) are such that σ(A) ∩ σ(-B) = ∅, where σ(M) denotes the spectrum of the matrix M. Then the unique and exact solution X*(t) of the system (1) is given by

X*(t) = e^{(t-t_0) A} (X_0 - X*) e^{(t-t_0) B} + X*,  (6)

where X* is the unique and exact solution of the algebraic Sylvester equation (2).
Proof The general solution of the homogeneous differential equation associated to (1) is given by X_h(t) = e^{t A} Y e^{t B}, where Y ∈ R^{n×s} is some constant matrix. Since σ(A) ∩ σ(-B) = ∅, the algebraic Sylvester equation (2) has a unique solution X* (see, e.g., [25, Thm. 2.4.4.1]). Now, as X* is a constant matrix function (i.e., X*(t) = X*), it can be seen as a particular solution of the differential equation (1). It follows that the general solution of this differential equation is given by X(t) = e^{t A} Y e^{t B} + X*. Finally, since the unique solution of the differential equation (1) must satisfy the initial condition X*(t_0) = X_0, it follows that X*(t_0) = e^{t_0 A} Y e^{t_0 B} + X* = X_0. The last equality implies Y = e^{-t_0 A} (X_0 - X*) e^{-t_0 B}, and expression (6) follows immediately.
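Theorem 1 can be checked numerically. The sketch below (in Python with NumPy/SciPy, not one of the original MATLAB codes) assumes the conventions used here, namely the algebraic equation A X* + X* B = C and the differential equation X'(t) = A X(t) + X(t) B - C; it builds the constant-solution formula and verifies the initial condition and the differential equation by a central finite difference:

```python
import numpy as np
from scipy.linalg import expm, solve_sylvester

rng = np.random.default_rng(1)
n, s, t0 = 6, 4, 0.0
# Stable A and B, so that sigma(A) and sigma(-B) are disjoint.
A = -np.eye(n) + 0.1 * rng.standard_normal((n, n))
B = -np.eye(s) + 0.1 * rng.standard_normal((s, s))
C = rng.standard_normal((n, s))
X0 = rng.standard_normal((n, s))

Xs = solve_sylvester(A, B, C)   # constant solution: A Xs + Xs B = C

def X(t):
    # constant-solution formula of Theorem 1
    return expm((t - t0) * A) @ (X0 - Xs) @ expm((t - t0) * B) + Xs

ic_err = np.linalg.norm(X(t0) - X0)          # initial condition X(t0) = X0
t, h = 0.7, 1e-5                             # central finite difference in t
dX = (X(t + h) - X(t - h)) / (2.0 * h)
ode_err = np.linalg.norm(dX - (A @ X(t) + X(t) @ B - C))
```

The finite-difference check is only accurate to O(h^2), which is enough to confirm that the formula satisfies the differential equation.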
In the remainder of this paper, we assume that the matrices A and B in (1) satisfy the condition

σ(A) ∩ σ(-B) = ∅.  (8)

The following property shows the behavior of the matrix solution X*(t) as the interval [t_0, T] becomes larger and larger, namely, as the final time T goes to +∞.
Proposition 1 [29, Chapter 8] Suppose that the coefficients A and B in the system (1) are stable matrices. Then the unique solution X*(t) of the differential system (1) satisfies lim_{t→+∞} X*(t) = X*, where X* is the unique solution of the corresponding algebraic Sylvester equation (2).
In the remainder of this section, we suppose that the matrix coefficients A and B are of moderate size. In this case, an approximate solution to the algebraic Sylvester equation (2) may be obtained by a direct solver such as the Bartels-Stewart algorithm, the Schur decomposition, or the Hammarling method [4,18,21,30,40]. A common point of all these methods is first the computation of the real Schur forms of the coefficient matrices using the QR algorithm. Then, the original equation is transformed into an equivalent form that is easier to solve by forward substitution. Now, suppose that X_a is an approximate solution to X*, the exact solution of the algebraic Sylvester equation (2). It follows that an approximate solution X_a(t) to X*(t), the exact solution of the differential Sylvester equation (1), can be expressed in the following form:

X_a(t) = e^{(t-t_0) A} (X_0 - X_a) e^{(t-t_0) B} + X_a.  (7)
Here, as A and B are assumed to be of moderate size, we also assume that both exponentials e^{(t-t_0) A} and e^{(t-t_0) B} are computed exactly. To establish an upper bound for the error norm, let us introduce the algebraic error E and the differential error E(t) given by E = X* - X_a and E(t) = X*(t) - X_a(t), respectively. Finally, recalling that ∆_T = T - t_0 and ∥E∥_∞ = sup_{t ∈ [t_0, T]} ∥E(t)∥, we have the following result.
Proposition 2 In the case where the matrix exponential is computed exactly, we have

∥E∥_∞ ≤ (1 + e^{∆_T (∥A∥ + ∥B∥)}) ∥E∥,

where E and E(t) are the errors associated to the approximate solutions X_a and X_a(t), respectively.
Proof Subtracting (7) from (6), we get E(t) = E - e^{(t-t_0) A} E e^{(t-t_0) B}, and from the triangle inequality we obtain ∥E(t)∥ ≤ ∥E∥ + ∥e^{(t-t_0) A}∥ ∥E∥ ∥e^{(t-t_0) B}∥. The Frobenius norm being submultiplicative (that is, ∥A B∥ ≤ ∥A∥ ∥B∥), we have ∥e^{s M}∥ ≤ e^{s ∥M∥} for all s ≥ 0 and for any square matrix M. Thus, ∥E(t)∥ ≤ (1 + e^{(t-t_0)(∥A∥ + ∥B∥)}) ∥E∥. As E(t) is a continuous matrix function on the interval [t_0, T], the desired result follows by taking the supremum over [t_0, T].
Let us now introduce R(t) and R, the residuals associated to the differential and algebraic Sylvester matrix equations, respectively. These residuals are defined by

R(t) = X_a'(t) - A X_a(t) - X_a(t) B + C  and  R = C - A X_a - X_a B,

and satisfy the following proposition.
Proposition 3 In the case where the matrix exponential is computed exactly, the residual for the differential equation is time independent and we have R(t) = R.

Proof From (7), we have X_a'(t) = A e^{(t-t_0) A} (X_0 - X_a) e^{(t-t_0) B} + e^{(t-t_0) A} (X_0 - X_a) e^{(t-t_0) B} B = A (X_a(t) - X_a) + (X_a(t) - X_a) B. On the other hand, we have A X_a(t) + X_a(t) B - C = A (X_a(t) - X_a) + (X_a(t) - X_a) B - R. Then, subtracting one of the two previous relations from the other, we get R(t) = R.

Before ending this section, we sketch in Algorithm 1 below the main steps that must be followed to obtain approximations X_k = X_a(t_k) to the solution of the differential Sylvester equation (1) at the different nodes t_k (k = 1, . . ., N) of a suitable discretization of the time interval [t_0, T].

Algorithm 1 Constant Solution Method in the case of moderate size (CSM)
1: Input: the matrices A, B, C, the initial and final times t_0, T, the number N of nodes and the time step δ_T.
2: Output: X_1, . . ., X_N, where X_k = X_a(t_k), 1 ≤ k ≤ N.
3: Solve the algebraic Sylvester equation A X + X B = C to get an approximate solution X_a to the exact solution X*.
4: For k = 1, . . ., N, compute X_k = e^{(t_k - t_0) A} (X_0 - X_a) e^{(t_k - t_0) B} + X_a.
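A minimal sketch of Algorithm 1 in Python, with SciPy's `solve_sylvester` and `expm` playing the roles of MATLAB's `lyap` and `expm`. The function name `csm` and the uniform time grid are illustrative choices; the step exponentials are computed once and propagated:

```python
import numpy as np
from scipy.linalg import expm, solve_sylvester

def csm(A, B, C, X0, t0, T, N):
    """Illustrative CSM sketch: approximate X(t_k) on a uniform grid.

    Solves A X + X B = C once, then evaluates the constant-solution
    formula by propagating the fixed step exponentials e^{dT A}, e^{dT B}.
    """
    dT = (T - t0) / N
    Xa = solve_sylvester(A, B, C)       # approximate constant solution
    EA, EB = expm(dT * A), expm(dT * B)
    L, R = np.eye(A.shape[0]), np.eye(B.shape[0])
    out = []
    for _ in range(N):
        L, R = EA @ L, R @ EB           # e^{(t_k - t0) A} and e^{(t_k - t0) B}
        out.append(L @ (X0 - Xa) @ R + Xa)
    return out

rng = np.random.default_rng(2)
n, s = 5, 3
A = -2.0 * np.eye(n) + 0.2 * rng.standard_normal((n, n))
B = -2.0 * np.eye(s) + 0.2 * rng.standard_normal((s, s))
C = rng.standard_normal((n, s))
X0 = np.zeros((n, s))

Xs = csm(A, B, C, X0, 0.0, 10.0, 50)
# With stable A and B, the solution approaches the constant solution (Prop. 1).
gap = np.linalg.norm(Xs[-1] - solve_sylvester(A, B, C))
```

The final check illustrates Proposition 1: for stable coefficients, X(t_k) converges to the constant solution as t_k grows.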

3 Block Arnoldi for solving large-scale differential Sylvester matrix equations
It is well known that computing the matrix exponential may be expensive when the matrix is very large. Thus, expression (6) may not be directly exploitable in the case of large-scale matrix coefficients. In the following, we will see how to circumvent this difficulty using projection methods onto some Krylov subspace. Indeed, in addition to allowing us to obtain a good approximation of the exact solution of the algebraic Sylvester equation (2), Krylov subspace methods are also a useful tool to compute the action of a matrix exponential on a block vector with satisfactory accuracy. During the last three decades, various projection methods on block, global or extended Krylov subspaces have been proposed to solve Sylvester matrix equations (or other similar equations) whose coefficients are large and sparse matrices [9,10,11,16,22,23,26,27,28]. The common idea behind these methods is to first reduce the size of the original equation by constructing a suitable Krylov basis, then solve the obtained low-dimensional equation by means of a direct method such as the Hessenberg-Schur method or the Bartels-Stewart algorithm [4,18], and finally recover the solution of the original large equation from the smaller one. For a complete overview of the main methods for solving algebraic Sylvester or Lyapunov equations, we refer to [3,13,39] and the references therein. In order to be as general as possible and not to impose restrictive assumptions, we opt for a resolution of the Sylvester (or Lyapunov) equation using the block Arnoldi process rather than the extended block Arnoldi process, since the latter requires that the coefficient matrices A and B be nonsingular. This last condition may not be fulfilled in many practical cases.
We recall that projection techniques on block Krylov subspaces for solving matrix differential equations were first proposed in [19,20], either by exploiting the integral formula (5) and approximating the exponential of a matrix times a block of vectors, or by solving a projected low-dimensional differential Sylvester matrix equation by means of numerical integration methods such as the backward differentiation formula (BDF) [12].
As said before, the approach we follow in this work is different from the one proposed in [19,20]. It consists of exploiting formula (6) instead of the integral formula (5), which is less expensive. To have at hand an adequate basis of the considered Krylov subspace, we will use the block Arnoldi process described in the next subsection.

The block Arnoldi process
Let M be an l × l matrix and V an l × s block vector. We consider the classical block Krylov subspace

K_m(M, V) = Range([V, M V, M^2 V, . . ., M^{m-1} V]).

The block Arnoldi process, described in Algorithm 2, generates an orthonormal basis V_m^M of the block Krylov subspace K_m(M, V).

Algorithm 2 The block Arnoldi process (BA)
1: Input: M a matrix of size l × l, V a matrix of size l × s and m an integer.
2: Compute the QR decomposition of V, i.e., V = V_1 Λ.
3: for j = 1, . . ., m do
4:   U = M V_j
5:   for i = 1, . . ., j do
6:     H_{i,j} = V_i^T U
7:     U = U - V_i H_{i,j}
8:   end for
9:   Get V_{j+1} and H_{j+1,j} by computing the QR decomposition of U, i.e., U = V_{j+1} H_{j+1,j}.
10: end for

Suppose that the upper triangular matrices H_{j+1,j} are full rank. Then, since the above algorithm involves a Gram-Schmidt procedure, the obtained block vectors V_1, V_2, . . ., V_m (V_i ∈ R^{l×s}) have mutually orthogonal columns. Hence, after m steps, Algorithm 2 generates an orthonormal basis V_m^M = [V_1, . . ., V_m] of K_m(M, V) and a block upper Hessenberg matrix H_m^M whose nonzero blocks are the H_{i,j} ∈ R^{s×s}. We have the following useful algebraic relations [17,36]:

M V_m^M = V_m^M H_m^M + V_{m+1} H_{m+1,m} (E_m^{(s)})^T,  (9)
(V_m^M)^T M V_m^M = H_m^M,  (10)
(V_m^M)^T V_m^M = I_{m s},  (11)

where E_m^{(s)} is the matrix of the last s columns of the m s × m s identity matrix I_{m s}, i.e., E_m^{(s)} = [0_{s×(m-1) s}, I_s]^T. In the following, we will use the notation V_m^A, H_m^A (resp. V_m^B, H_m^B) for the basis and the block Hessenberg matrix produced by Algorithm 2 applied to M = A (resp. M = B^T).
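The block Arnoldi process can be sketched as follows. This illustrative NumPy implementation assumes no deflation occurs (the blocks H_{j+1,j} stay full rank) and checks the orthonormality of the generated basis together with the block Arnoldi relation M V_m = V_{m+1} H̄_m:

```python
import numpy as np

def block_arnoldi(M, V, m):
    """Illustrative sketch of the block Arnoldi process (Algorithm 2)."""
    s = V.shape[1]
    Vs = [np.linalg.qr(V)[0]]                    # V_1: orthonormal columns
    H = np.zeros((s * (m + 1), s * m))           # block upper Hessenberg
    for j in range(m):
        U = M @ Vs[j]
        for i in range(j + 1):                   # modified block Gram-Schmidt
            Hij = Vs[i].T @ U
            H[i*s:(i+1)*s, j*s:(j+1)*s] = Hij
            U = U - Vs[i] @ Hij
        Q, R = np.linalg.qr(U)                   # U = V_{j+1} H_{j+1,j}
        Vs.append(Q)
        H[(j+1)*s:(j+2)*s, j*s:(j+1)*s] = R
    return np.hstack(Vs[:m]), np.hstack(Vs), H

rng = np.random.default_rng(3)
l, s, m = 40, 2, 5
M = rng.standard_normal((l, l))
V = rng.standard_normal((l, s))
Vm, Vmp1, H = block_arnoldi(M, V, m)

orth_err = np.linalg.norm(Vmp1.T @ Vmp1 - np.eye((m + 1) * s))
# block Arnoldi relation: M V_m = V_{m+1} Hbar_m
rel_err = np.linalg.norm(M @ Vm - Vmp1 @ H)
```

A production implementation would also handle rank-deficient blocks (deflation) and possibly reorthogonalization, which are omitted here for clarity.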

Full rank case
Here, we suppose that A is a large matrix while B is relatively smaller, i.e., s ≪ n. We also assume that the right-hand side C is full rank, i.e., rank(C) = s. For simplicity, we take X_0 = 0 in the initial condition since, as mentioned in the introduction, this is not restrictive.
To obtain approximate solutions to the algebraic Sylvester equation (2), one can use the block Arnoldi method, in which we consider approximate solutions of the form

X_m^a = V_m^A Y_m,  (13)

where V_m^A is the orthonormal Krylov basis generated by applying m iterations of Algorithm 2 to the pair (A, C). Let R_m be the algebraic residual given by R_m = C - A X_m^a - X_m^a B. The correction Y_m is obtained by imposing the Petrov-Galerkin condition (V_m^A)^T R_m = 0. Thus, taking into account the relations (9)-(11) and (13), it follows that Y_m is the solution of the reduced Sylvester equation

H_m^A Y_m + Y_m B = (V_m^A)^T C.

If σ(H_m^A) ∩ σ(-B) = ∅, then the previous Sylvester equation admits a unique solution, which can be obtained by a direct method [4,18]. In addition, from the relations (9)-(11), the residual R_m satisfies the following relation

R_m = -V_{m+1} H_{m+1,m} (E_m^{(s)})^T Y_m.  (14)

According to [32,34], the following approximation to e^{(t-t_0) A} X* holds:

e^{(t-t_0) A} X* ≈ e^{(t-t_0) A} V_m^A Y_m ≈ V_m^A e^{(t-t_0) H_m^A} Y_m.  (15)

It follows that an approximate solution X_m(t), for t ∈ [t_0, T], to the exact solution X*(t) of the differential Sylvester matrix equation (1) may be obtained by

X_m(t) = -e^{(t-t_0) A} V_m^A Y_m e^{(t-t_0) B} + V_m^A Y_m.  (16)

Taking into account (13), it follows that X_m(t) = V_m^A Y_m(t), where

Y_m(t) = Y_m - e^{(t-t_0) H_m^A} Y_m e^{(t-t_0) B}.  (17)

This matrix function satisfies the following result.

Proposition 4
The matrix function Y_m(t) given by (17) is the unique solution of the reduced differential Sylvester matrix equation

Y_m'(t) = H_m^A Y_m(t) + Y_m(t) B - (V_m^A)^T C,  Y_m(t_0) = 0.  (18)

Proof The derivative of the matrix function Y_m(t) as given by (17) is Y_m'(t) = -H_m^A e^{(t-t_0) H_m^A} Y_m e^{(t-t_0) B} - e^{(t-t_0) H_m^A} Y_m e^{(t-t_0) B} B = H_m^A (Y_m(t) - Y_m) + (Y_m(t) - Y_m) B. On the other hand, since Y_m solves the reduced Sylvester equation, we have H_m^A Y_m + Y_m B = (V_m^A)^T C. Thus, it follows that Y_m'(t) = H_m^A Y_m(t) + Y_m(t) B - (V_m^A)^T C, and since Y_m(t_0) = 0, the result follows.

Remark 1 Proposition 4 shows that another way to obtain an approximation of Y_m(t) is the resolution of the projected and reduced differential equation (18) by using an adequate numerical ODE solver such as Runge-Kutta or BDF solvers. We recall that such a technique was used in [19,20]. In our proposed method, we do not use such an approach; instead, we solve the reduced algebraic equation and take into account the approximations (15) and (17) to get an approximate solution to the low-dimensional differential matrix equation (18). Now, let R_m(t) be the residual associated to the approximate solution X_m(t), i.e.,

R_m(t) = X_m'(t) - A X_m(t) - X_m(t) B + C.  (19)

The following proposition gives an expression for this residual.

Proposition 5
The residual for the differential equation is given by

R_m(t) = -V_{m+1} H_{m+1,m} (E_m^{(s)})^T Y_m(t),  (20)

where R_m is the algebraic residual given in (14). Moreover,

∥R_m(t)∥ = ∥H_{m+1,m} Ỹ_m(t)∥,  (21)

where Ỹ_m(t) denotes the s × s matrix formed by the last s rows of Y_m(t).

Proof Replacing, in (19), X_m(t) by its expression given by (16), using the relations (9)-(11) and the fact that Y_m(t) is the solution of (18), we get (20). Now, according to (17), relation (21) follows from the orthonormality of the columns of V_{m+1}. Then, using (21) and the fact that Y_m(t) tends to Y_m when H_m^A and B are stable, we get lim_{t→+∞} ∥R_m(t)∥ = ∥H_{m+1,m} Ỹ_m∥ = ∥R_m∥. In addition, and as done in the previous section, we consider the differential error E_m(t) = X*(t) - X_m(t). We recall that X_m(t) and X*(t) are the approximate and exact solutions to the differential equation, respectively.
The following result gives an error estimate for the error norm ∥E_m∥_∞.
Theorem 2 At step m, the following error estimate holds:

∥E_m∥_∞ ≤ ∆_T e^{∆_T (∥A∥ + ∥B∥)} ∥H_{m+1,m}∥ (∥Ỹ_m∥ + ∥Z̃_m∥),

where Ỹ_m and Z̃_m are the matrices of size s × s formed by the last s rows of Y_m and of Z_m := e^{∆_T H_m^A} Y_m e^{∆_T B}, respectively.
Proof From (8) and the differential Sylvester matrix equation (1), we have E_m'(t) = A E_m(t) + E_m(t) B - R_m(t), with E_m(t_0) = 0. Thus, the function E_m(t) satisfies a differential Sylvester matrix equation whose solution may be written by the integral formula E_m(t) = -∫_{t_0}^{t} e^{(t-u) A} R_m(u) e^{(t-u) B} du. Passing to the norm, for all t ∈ [t_0, T], we get ∥E_m(t)∥ ≤ ∫_{t_0}^{t} ∥e^{(t-u) A}∥ ∥R_m(u)∥ ∥e^{(t-u) B}∥ du. As ∥e^{α M}∥ ≤ e^{α ∥M∥} for α ≥ 0 and M = A or M = B, we obtain, for all t ∈ [t_0, T], after integration, ∥E_m(t)∥ ≤ ∆_T e^{∆_T (∥A∥ + ∥B∥)} sup_{u ∈ [t_0, T]} ∥R_m(u)∥. Then, using (21) and the triangle inequality, the supremum of ∥R_m(u)∥ is bounded by ∥H_{m+1,m}∥ (∥Ỹ_m∥ + ∥Z̃_m∥). Finally, the desired result is obtained by passing to the uniform norm.

Now, we point out that (20) provides a cheap formula for computing, at each node t_k, the norm r_{m,k} := ∥R_m(t_k)∥ of the residual associated to the approximate solution X_{m,k} := X_m(t_k). This formula avoids computing matrix-vector products with the large coefficient matrix A, since we have r_{m,k} = ∥H_{m+1,m} Ỹ_{m,k}∥, where Ỹ_{m,k} is the matrix of size s × s formed by the last s rows of the matrix Y_{m,k} := Y_m(t_k). Finally, we end this section by summarizing in Algorithm 3 our proposed method, the block Arnoldi process combined with the constant solution method (BA-CSM), applied to full-rank differential Sylvester equations.

Algorithm 3 Block Arnoldi Constant Solution Method (BA-CSM) (Full-rank case)
1: Input: the matrices A, B, C, the initial and final times t_0, T, a tolerance tol > 0, a maximum number of iterations M_max, a step-size parameter p and the number N of nodes in the time discretization.
2: Output: X_{m,1}, . . ., X_{m,N}, where X_{m,k} = X_m(t_k), 1 ≤ k ≤ N.
3: for m = 1, 2, . . ., M_max do
4:   Extend the block Arnoldi basis V_m^A and the block Hessenberg matrix H_m^A by one step of Algorithm 2 applied to (A, C).
5:   if m is a multiple of p then
6:     Solve the reduced Sylvester equation H_m^A Y_m + Y_m B = (V_m^A)^T C.
7:     for k = 1, . . ., N do
8:       Compute r_{m,k} = ∥H_{m+1,m} Ỹ_{m,k}∥.
9:     end for
10:    if max_k r_{m,k} ≤ tol then stop.
11:  end if
12: end for
13: for k = 1, . . ., N do
14:  Compute Y_{m,k} = Y_m - e^{(t_k-t_0) H_m^A} Y_m e^{(t_k-t_0) B} and the approximate solution X_{m,k} = V_m^A Y_{m,k}.
15: end for

Low-rank case
Now, we consider the case where both A and B are large matrices. In addition, the coefficient C appearing in the right-hand side is assumed to be of low rank and given in the factored form C = E F^T, where E ∈ R^{n×r} and F ∈ R^{s×r}. We also assume, for simplicity, that the initial condition is X_0 = 0. To obtain approximate solutions to the low-rank algebraic Sylvester equation (2), we can use the block Arnoldi method, in which we consider approximate solutions of the form

X_m^a = V_m^A Y_m (V_m^B)^T,  (24)

where V_m^A, V_m^B are the orthonormal matrices obtained by running m iterations of Algorithm 2 applied to the pairs (A, E) and (B^T, F), respectively. We enforce the Petrov-Galerkin condition (V_m^A)^T R_m V_m^B = 0, where the algebraic residual R_m is given by

R_m = C - A X_m^a - X_m^a B.  (25)

Multiplying (25) on the left by (V_m^A)^T and on the right by V_m^B and taking into account the relations (9)-(11) and (24), it follows immediately that Y_m is the solution of the reduced projected Sylvester equation

H_m^A Y_m + Y_m (H_m^B)^T = (V_m^A)^T E F^T V_m^B,  (26)

where H_m^A and H_m^B are the m r × m r upper block Hessenberg matrices generated by the block Arnoldi process. If σ(H_m^A) ∩ σ(-(H_m^B)^T) = ∅, then (26) admits a unique solution, which can be computed using a standard direct method such as those described in [4,18]. Using the relations (9)-(10) and (24)-(26), we get

R_m = -V_{m+1}^A H_{m+1,m}^A (E_m^{(r)})^T Y_m (V_m^B)^T - V_m^A Y_m E_m^{(r)} (H_{m+1,m}^B)^T (V_{m+1}^B)^T,  (27)

where E_m^{(r)} = [0_{r×(m-1) r}, I_r]^T. We also notice that, according to [32,34], an approximation to e^{(t-t_0) A} X_m^a e^{(t-t_0) B} may be obtained as

e^{(t-t_0) A} V_m^A Y_m (V_m^B)^T e^{(t-t_0) B} ≈ V_m^A e^{(t-t_0) H_m^A} Y_m e^{(t-t_0) (H_m^B)^T} (V_m^B)^T.

Then, it follows that an approximate solution X_m(t) to the exact solution X*(t) of the differential Sylvester matrix equation (1) may be given by

X_m(t) = -e^{(t-t_0) A} V_m^A Y_m (V_m^B)^T e^{(t-t_0) B} + V_m^A Y_m (V_m^B)^T.  (28)

Taking into account (24) gives X_m(t) = V_m^A Y_m(t) (V_m^B)^T, where

Y_m(t) = Y_m - e^{(t-t_0) H_m^A} Y_m e^{(t-t_0) (H_m^B)^T}.  (29)

As in the full-rank case, we have the following proposition.

Proposition 6
The matrix function Y_m(t) given by (29) is the unique solution of the reduced differential Sylvester matrix equation

Y_m'(t) = H_m^A Y_m(t) + Y_m(t) (H_m^B)^T - (V_m^A)^T E F^T V_m^B,  Y_m(t_0) = 0.  (30)

Proof The derivative of the matrix function Y_m(t) as given by (29) is Y_m'(t) = -H_m^A e^{(t-t_0) H_m^A} Y_m e^{(t-t_0) (H_m^B)^T} - e^{(t-t_0) H_m^A} Y_m e^{(t-t_0) (H_m^B)^T} (H_m^B)^T = H_m^A (Y_m(t) - Y_m) + (Y_m(t) - Y_m) (H_m^B)^T. On the other hand, from (26) we have H_m^A Y_m + Y_m (H_m^B)^T = (V_m^A)^T E F^T V_m^B. Thus, it follows that Y_m(t) satisfies (30).

Next, the following proposition gives a useful expression of the residual, which is defined in the low-rank case by

R_m(t) = X_m'(t) - A X_m(t) - X_m(t) B + C.  (31)

Proposition 7 The residual for the differential equation is given by

R_m(t) = R_m + V_{m+1}^A H_{m+1,m}^A (E_m^{(r)})^T F_m(t) (V_m^B)^T + V_m^A F_m(t) E_m^{(r)} (H_{m+1,m}^B)^T (V_{m+1}^B)^T,  (32)

where F_m(t) = e^{(t-t_0) H_m^A} Y_m e^{(t-t_0) (H_m^B)^T} and R_m is the algebraic residual given by (25). In addition,

R_m(t) = -V_{m+1}^A H_{m+1,m}^A (E_m^{(r)})^T Y_m(t) (V_m^B)^T - V_m^A Y_m(t) E_m^{(r)} (H_{m+1,m}^B)^T (V_{m+1}^B)^T,  (33)

and

∥R_m(t)∥^2 = ∥H_{m+1,m}^A (E_m^{(r)})^T Y_m(t)∥^2 + ∥Y_m(t) E_m^{(r)} (H_{m+1,m}^B)^T∥^2.  (34)

Proof Using the definition (31) of the residual R_m(t), replacing X_m(t) by its expression given in (28), using the algebraic relation (9), in which M is replaced either by A or by B^T, and taking into account (30), we get (33). The relation (32) follows by replacing Y_m(t) by its expression (29) and taking into account (27). Finally, (34) is straightforward since V_m^A, V_{m+1}^A and V_m^B, V_{m+1}^B have orthonormal columns and the two terms in (33) have mutually orthogonal column spaces.
Similarly to the full-rank case, let us remark that if H_m^A and H_m^B are stable, then lim_{t→+∞} ∥R_m(t)∥ = ∥R_m∥. As in the previous subsection, we have the following error estimates.
Theorem 3 Let X_m(t), for t ∈ [t_0, T], be the approximate solution at step m given by (28) and (29), and let E_m(t) = X*(t) - X_m(t) be the error. Then, we have the following error estimate:

∥E_m∥_∞ ≤ ∆_T e^{∆_T (∥A∥ + ∥B∥)} sup_{t ∈ [t_0, T]} ∥R_m(t)∥.

Proof As previously done in the proof of Theorem 2, we obtain by similar arguments that, for all t ∈ [t_0, T], ∥E_m(t)∥ ≤ e^{∆_T (∥A∥ + ∥B∥)} ∫_{t_0}^{t} ∥R_m(u)∥ du. From (33) and (27), and using the triangle inequality, the supremum of ∥R_m(u)∥ over [t_0, T] is bounded by computable quantities involving ∥H_{m+1,m}^A∥, ∥H_{m+1,m}^B∥ and the last block rows and columns of Y_m and F_m, which completes the proof.
To continue the description of the present method, we notice that (32) enables us to check whether ∥R_m(t)∥ < tol, where tol is some fixed tolerance, without having to compute extra products involving the large matrices A and B. More precisely, we have ∥R_m(t)∥^2 = ∥H_{m+1,m}^A (E_m^{(r)})^T Y_m(t)∥^2 + ∥Y_m(t) E_m^{(r)} (H_{m+1,m}^B)^T∥^2. We end this subsection by recalling that, in the case of large-scale problems, and as suggested in [23,38], it is important to obtain the approximate solution X_k := X_m(t_k) at each time t_k as a product of two low-rank matrices. Let Y_k = V Σ W^T be the singular value decomposition of Y_k, where Σ is the diagonal matrix of the singular values of Y_k sorted in decreasing order. By considering V_l and W_l, the m r × l matrices of the first l columns of V and W corresponding to the l singular values of magnitude greater than some tolerance τ, we get for each k the approximation X_k ≈ Z_1^{(k)} (Z_2^{(k)})^T, where Z_1^{(k)} = V_m^A V_l Σ_l^{1/2} and Z_2^{(k)} = V_m^B W_l Σ_l^{1/2}. The block Arnoldi process combined with the Constant Solution Method (BA-CSM) for solving the differential Sylvester matrix equation, in the case where C is of low rank, i.e., C = E F^T, is summarized in Algorithm 4.
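The truncation step can be sketched in isolation. In this illustrative example the projected solution Y_k is synthesized with known singular values, so the retained rank and the truncation error are predictable (the threshold τ and the sizes are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(5)
mr, tau = 30, 1e-8
# Synthesize a projected solution Y_k with known, rapidly decaying singular values.
Q1 = np.linalg.qr(rng.standard_normal((mr, 5)))[0]
Q2 = np.linalg.qr(rng.standard_normal((mr, 5)))[0]
sigma = np.array([1.0, 1e-2, 1e-4, 1e-6, 1e-12])
Yk = Q1 @ np.diag(sigma) @ Q2.T

V, sig, Wt = np.linalg.svd(Yk, full_matrices=False)
l = int(np.sum(sig > tau))                 # keep the singular values above tau
Z1 = V[:, :l] * np.sqrt(sig[:l])           # X_k ~ (V_m^A Z1)(V_m^B Z2)^T in BA-CSM
Z2 = Wt[:l, :].T * np.sqrt(sig[:l])
trunc_err = np.linalg.norm(Yk - Z1 @ Z2.T)
```

By the Eckart-Young theorem, the truncation error equals the norm of the discarded singular values, which is why a threshold τ directly controls the accuracy of the low-rank factors.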

Algorithm 4 Block Arnoldi Constant Solution Method (BA-CSM) (Low-rank case)
1: Input: the matrices A, B, E, F, the initial and final times t_0, T, a tolerance tol > 0, a maximum number of iterations M_max, a step-size parameter p, the number N of nodes in the time discretization and the tolerance τ for the truncated SVD.
2: Output: X_{m,1}, . . ., X_{m,N}, where X_{m,k} = X_m(t_k) is given in the factored form Z_1^{(k)} (Z_2^{(k)})^T.
3: for m = 1, 2, . . ., M_max do
4:   Extend the bases V_m^A, V_m^B and the block Hessenberg matrices H_m^A, H_m^B by one step of Algorithm 2 applied to (A, E) and (B^T, F).
5:   if m is a multiple of p then
6:     Solve the reduced Sylvester equation H_m^A Y_m + Y_m (H_m^B)^T = (V_m^A)^T E F^T V_m^B.
7:     for k = 1, . . ., N do
8:       Compute Y_{m,k} = Y_m - e^{(t_k-t_0) H_m^A} Y_m e^{(t_k-t_0) (H_m^B)^T} and the residual norm r_{m,k} from (34).
9:     end for
10:    if max_k r_{m,k} ≤ tol then stop.
11:  end if
12: end for
13: for k = 1, . . ., N do
14:  Compute the SVD Y_{m,k} = V Σ W^T, where Σ = diag[σ_1, . . ., σ_{m r}];
15:  Find l such that σ_{l+1} ≤ τ < σ_l and set Σ_l = diag[σ_1, . . ., σ_l];
16:  Form Z_1^{(k)} = V_m^A V_l Σ_l^{1/2} and Z_2^{(k)} = V_m^B W_l Σ_l^{1/2}; the approximate solution at time t_k is X_{m,k} = Z_1^{(k)} (Z_2^{(k)})^T.
17: end for

4 Numerical experiments

In this section, a series of numerical tests is presented to examine the performance and potential of Algorithms 1, 3 and 4. We compare our proposed method, which is based on relation (6), with the one described in [20], which is based on the integral formula (5). We recall that the algorithms described in [20] only provide an approximate solution at the final time T and, moreover, only deal with the case of low-rank differential equations. Thus, we modified Algorithm 1 proposed in [20] so that it provides an approximate solution X_{m,k} = X_m(t_k) at each node t_k of the discretization of the time interval [0, T], as is the case in Algorithm 4. Moreover, we drafted two other codes based on the integral formula (5) and equivalent to Algorithm 1 and Algorithm 3. It should be noted that in all the examples given here, we suppose that X_0, the matrix appearing in the initial condition of (1), is equal to zero, i.e., X_0 = 0_{n×s}. Furthermore, we consider different time intervals [t_0, T], where t_0 = 0 is fixed once and for all, while T is indicated in each example. The time interval [0, T] is divided into sub-intervals of constant length δ_T = T/N, where N is the number of nodes. All the numerical experiments were performed using MATLAB and have been carried out on an Intel(R) Core(TM) i7 with 2.60 GHz processing speed and 16 GB of memory. In order to implement the different algorithms described in this work, we used
the following MATLAB functions:
- expm: computes the exponential of a square matrix; this function is based on a scaling and squaring algorithm with a Padé approximation [24].
- lyap: solves Sylvester or Lyapunov matrix equations; for our purposes, the instruction lyap(A,B,-C) delivers the matrix X solution of the algebraic Sylvester equation A X + X B = C.
- integral: computes an integral numerically; it is used with the arguments "ArrayValued" and "true".
Furthermore, we point out that when the constant solution or integral formula methods are combined with the block Arnoldi process to obtain an approximate solution of the differential equation, the iterations are stopped as soon as the dimension of the Krylov subspace generated by the block Arnoldi process reaches a maximum value m = M_max = 110, or as soon as the maximal norm r_max computed by the algorithm is lower than 10^{-10} µ, where µ = ∥A∥ + ∥B∥ + ∥C∥ in the full-rank case and µ = ∥A∥ + ∥B∥ + ∥E∥ ∥F∥ in the low-rank case. We also mention that in the numerical examples, the right-hand side C or its factors E and F were generated randomly.
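For readers reproducing the experiments outside MATLAB, the SciPy counterparts of these functions behave as follows (an illustrative sketch; note in particular the sign-convention difference between MATLAB's lyap and SciPy's solve_sylvester):

```python
import numpy as np
from scipy.linalg import expm, solve_sylvester

rng = np.random.default_rng(6)
A = -np.eye(4) + 0.3 * rng.standard_normal((4, 4))
B = -np.eye(3) + 0.3 * rng.standard_normal((3, 3))
C = rng.standard_normal((4, 3))

# scipy.linalg.solve_sylvester(A, B, C) returns the X with A X + X B = C,
# i.e. the same matrix as MATLAB's lyap(A, B, -C).
X = solve_sylvester(A, B, C)
sylv_err = np.linalg.norm(A @ X + X @ B - C)

# scipy.linalg.expm plays the role of MATLAB's expm
# (scaling and squaring with Pade approximants); expm(0) = I.
exp_err = np.linalg.norm(expm(np.zeros((4, 4))) - np.eye(4))
```

SciPy's `scipy.integrate.quad_vec` can similarly stand in for MATLAB's integral with "ArrayValued" set to true.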
To compare the performances of the Constant Solution method (CS in short, or CS-BA when combined with the block Arnoldi process) with those of the Integral Formula method (IF in short, or IF-BA when combined with the block Arnoldi process), we used the following comparison criteria:
- TR: the time ratio between the CPU times of the CS and IF methods, or between those of the CS-BA and IF-BA methods.
- RDN: the relative difference norm between X_CS-BA and X_IF-BA, which are the approximate solutions delivered by the constant solution and the integral formula methods, respectively, when they are combined with the block Arnoldi process.
We point out that this criterion is used when the exact solution of the differential Sylvester equation is not available.
- REN: the relative error norm between the exact solution and an approximate solution obtained either by a constant solution based algorithm or by an integral formula based algorithm. More precisely, letting X_Exact be the exact solution computed by (37), REN is given by ∥X_approx - X_Exact∥ / ∥X_Exact∥ for the approximate solution X_approx delivered by each compared method.
Before starting the numerical experiments and tests, we show in the next subsection how to construct a differential Sylvester equation which has a known exact benchmark solution.

A benchmark example
To the best of our knowledge, there is no known explicit benchmark solution for large-scale differential Sylvester matrix equations. Here, we show how to construct a benchmark differential Sylvester equation whose exact solution is known, against which we can compare the approximate solutions provided by the different methods.
Let K, R ∈ R^{p0×p0} be two nilpotent matrices of index p0 ≥ 3, i.e., K^{p0} = R^{p0} = 0_{p0}, and let A0 ∈ R^{n0×n0}, B0 ∈ R^{s0×s0} and X0, C ∈ R^{n×s}, where n = p0 n0 and s = p0 s0. The integer p0 is small. We also choose two real numbers α, β and consider the matrices defined in (36). We easily have, for any real t and any matrix X of size n × s: Assuming that α + β < 0, the unique solution X* of the algebraic Sylvester matrix equation AX + XB = C is given by the explicit formula of [31]. Then, using formula (6), the unique solution of the differential Sylvester matrix equation satisfying the initial condition X(t0) = X0 is the matrix function X*(t) given in (37). We may also obtain the solution X*(t) of the differential Sylvester matrix equation by using the integral formula (5). It follows that X*(t) can be written in terms of the scalar functions I_k(t) = ∫_{t0}^{t} (t − u)^k e^{(α+β)(t−u)} du. The expressions of the functions I_k(t) are obtained by recursion: the case k = 0 is immediate and, for k ≥ 1, integration by parts provides a recurrence, from which a closed form valid for all k ≥ 0 follows by induction. Before ending this subsection, let us remark that in this benchmark example the matrix C is arbitrary and can therefore also be taken in the low-rank form C = E F^T, where E ∈ R^{n×r} and F ∈ R^{s×r}.
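Since the construction (36) is not reproduced in this excerpt, the following Python sketch illustrates the underlying idea under an assumed (and therefore hypothetical) construction A = αI + N_A, B = βI + N_B with N_A, N_B nilpotent of index p0: because the nilpotent parts truncate the exponential series, the solution of A X + X B = C reduces to a finite double sum, which we check against a direct Sylvester solve.

```python
import numpy as np
from math import comb
from scipy.linalg import solve_sylvester

p0, n0, s0 = 3, 4, 2
n, s = p0 * n0, p0 * s0
alpha, beta = -2.0, -1.0          # alpha + beta < 0, as required
rng = np.random.default_rng(1)

K = np.diag(np.ones(p0 - 1), 1)   # nilpotent shift, K**p0 = 0
R = np.diag(np.ones(p0 - 1), 1)
NA = np.kron(K, rng.standard_normal((n0, n0)))  # nilpotent of index <= p0
NB = np.kron(R, rng.standard_normal((s0, s0)))
A = alpha * np.eye(n) + NA        # assumed form; the paper's (36) may differ
B = beta * np.eye(s) + NB
C = rng.standard_normal((n, s))

# From X* = -int_0^inf e^{tA} C e^{tB} dt, the series truncates:
# X* = -sum_{k,l < p0} C(k+l, k) NA^k C NB^l / (-(alpha+beta))^{k+l+1}.
X = np.zeros((n, s))
for k in range(p0):
    for l in range(p0):
        term = np.linalg.matrix_power(NA, k) @ C @ np.linalg.matrix_power(NB, l)
        X -= comb(k + l, k) * term / (-(alpha + beta)) ** (k + l + 1)

print(np.linalg.norm(X - solve_sylvester(A, B, C)))  # agreement check
```

The finite sum agrees with the generic solver to machine precision, which is exactly the property that makes such benchmark solutions cheap to evaluate.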
In addition, to show the strengths and limitations of the compared methods in various experimental settings when using this benchmark example, we choose p0 = 3 and consider different values of the parameters α and β, as well as different matrices A0 and B0. The matrices K and R are fixed once and for all, as follows

Experiment 1
In this first experiment, the numerical tests are done with moderate-size matrices A and B. We compare the solution provided by our proposed constant solution method, implemented via Algorithm 1, with the one obtained using the integral formula (5), as well as with the solutions given by some classical ODE solvers from MATLAB. The solvers ode15s, ode23s, ode23t and ode23tb are usually used for stiff ODEs, while the solvers ode45, ode23 and ode113 are used for non-stiff ODEs. Since some ODE solvers behave similarly, and in order not to overload the plots, we only give the results obtained with the four methods ode15s, ode23s, ode23tb and ode45. In the following two experiments, we consider the time intervals [0, T], where either T = 1 with N = 10 nodes or T = 10 with N = 50 nodes; the time step is thus δT = 0.1 when T = 1 and δT = 0.2 when T = 10. Here, we consider the matrices A0 = gallery('leslie', n0) and B0 = gallery('minij', s0) with n0 = 50 and s0 = 10, and the coefficient matrices A, B of the differential Sylvester equations are generated by (36), as explained in the benchmark example. The parameters α, β are equal to −2 and −1 respectively. As the matrices K, R are those given at the beginning of Section 4, the sizes of the matrices A, B are now n = 150 and s = 30 respectively. We point out that the solution computed by Algorithm 1, those computed by the algorithm based on the integral formula, and those issued by the MATLAB ODE solvers are all compared to the exact solution given in (37), which is taken as the reference solution X_ref. Thus, in the plots, we represent the behaviour of the relative error norm as a function of t_k, where t_k = k δT. The obtained plots and results are reported below in Figure 1 and Table 1 respectively. The analysis of the results of Experiment 1 shows that the CS and IF methods return the best results in terms of the error norm. The ode45 solver is the best among the other MATLAB solvers, but its performance does not match that of the CS and IF methods.
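The essence of this comparison can be reproduced in a few lines. The sketch below (our own Python stand-in, assuming the usual convention X'(t) = A X + X B + C for the differential equation) evaluates the constant-solution closed form X(t) = e^{(t−t0)A}(X0 − X*)e^{(t−t0)B} + X*, with X* solving A X* + X* B = −C, and checks it against a generic ODE integrator applied to the vectorized equation, the role played by the MATLAB ode* solvers above.

```python
import numpy as np
from scipy.linalg import expm, solve_sylvester
from scipy.integrate import solve_ivp

rng = np.random.default_rng(2)
n, s = 5, 3
A = -2.0 * np.eye(n) + 0.1 * rng.standard_normal((n, n))
B = -1.0 * np.eye(s) + 0.1 * rng.standard_normal((s, s))
C = rng.standard_normal((n, s))
X0 = rng.standard_normal((n, s))
T = 1.0

# Constant (equilibrium) solution: A X* + X* B = -C.
Xstar = solve_sylvester(A, B, -C)
# Closed-form solution at time T via two matrix exponentials.
X_cs = expm(T * A) @ (X0 - Xstar) @ expm(T * B) + Xstar

def rhs(t, x):
    """Vectorized differential Sylvester equation X' = A X + X B + C."""
    X = x.reshape(n, s)
    return (A @ X + X @ B + C).ravel()

sol = solve_ivp(rhs, (0.0, T), X0.ravel(), rtol=1e-10, atol=1e-12)
X_ode = sol.y[:, -1].reshape(n, s)
print(np.linalg.norm(X_cs - X_ode) / np.linalg.norm(X_cs))
```

The closed form requires no time stepping at all, which is the source of the accuracy and speed advantage reported for the CS method.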

Experiment 2
In this set of numerical tests, the experiments are done with a relatively large matrix A and a moderate-size matrix B. We compare the performance of Algorithm 3, which implements the CS-BA method, with that of the equivalent algorithm based on the integral formula combined with the block Arnoldi process (IF-BA).
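Both CS-BA and IF-BA rest on the block Arnoldi process, which builds an orthonormal basis V_m of the block Krylov subspace K_m(A, V) = span{V, AV, ..., A^{m-1}V} together with the block Hessenberg matrix H_m = V_m^T A V_m. The following is a generic textbook sketch of that process (not the paper's Algorithm 2 verbatim):

```python
import numpy as np

def block_arnoldi(A, V, m):
    """m steps of the block Arnoldi process for the block Krylov
    subspace K_m(A, V); returns the basis V_m and H_m = V_m^T A V_m."""
    n, r = V.shape
    Q, _ = np.linalg.qr(V)
    blocks = [Q]
    H = np.zeros((r * (m + 1), r * m))
    for j in range(m):
        W = A @ blocks[j]
        # Block modified Gram-Schmidt against all previous blocks.
        for i in range(j + 1):
            Hij = blocks[i].T @ W
            H[i * r:(i + 1) * r, j * r:(j + 1) * r] = Hij
            W -= blocks[i] @ Hij
        Q, Rj = np.linalg.qr(W)
        H[(j + 1) * r:(j + 2) * r, j * r:(j + 1) * r] = Rj
        blocks.append(Q)
    Vm = np.hstack(blocks[:m])
    return Vm, H[:r * m, :r * m]

rng = np.random.default_rng(3)
A = rng.standard_normal((40, 40))
V = rng.standard_normal((40, 2))
Vm, Hm = block_arnoldi(A, V, 5)
print(np.linalg.norm(Vm.T @ Vm - np.eye(10)))   # orthonormality of the basis
print(np.linalg.norm(Vm.T @ A @ Vm - Hm))       # H_m = V_m^T A V_m
```

The projected equations solved by CS-BA then involve H_m in place of the large matrix A, which is what makes the approach viable at large scale.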
Experiment 2.1. The matrices A and B are obtained from the centered finite difference discretization of the operators L_A and L_B on the unit square [0, 1] × [0, 1] with homogeneous Dirichlet boundary conditions. To generate the coefficient matrices A and B, we used the fdm_2d_matrix function from the LYAPACK toolbox [33] as follows: A = fdm_2d_matrix(n0, f_A, g_A, h_A) and B = fdm_2d_matrix(s0, f_B, g_B, h_B), where n0 and s0 are the numbers of inner grid points in each direction when discretizing the operators L_A and L_B respectively. This gives A ∈ R^{n×n} and B ∈ R^{s×s} with n = n0^2 and s = s0^2. We examine the performance of CS-BA and IF-BA for four choices of n0 and s0: (n0, s0) = (30, 3), (50, 3), (30, 5) and (50, 5). The considered time intervals are [0, T], where T = 1 and N = 10 or T = 2 and N = 20; the time step is thus always δT = 0.1. In Table 2, we report the time ratio (TR) between the CPU times of CS-BA and IF-BA, and the relative difference norm (RDN). Experiment 2.2. In this test, we took A0 = gallery('hanowa', 1500, -5) and B0 = gallery('leslie', 6) from the MATLAB gallery and transformed them into A and B of sizes n = 4500 and s = 18 respectively by using (36), in which we took α = −7 and β = −5. The obtained results for different time intervals, summarized in Table 3, include the time ratio TR and the relative error norms REN_CS-BA and REN_IF-BA between the approximate solutions X_CS-BA, X_IF-BA given by CS-BA and IF-BA respectively and the exact solution X_Exact computed by (37).
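Since the definitions of L_A and L_B are not reproduced in this excerpt, the following hedged stand-in shows the kind of matrix that such a discretization produces: the standard 5-point finite-difference matrix for the negative Laplacian on the unit square with homogeneous Dirichlet conditions, assembled as a Kronecker sum (LYAPACK's fdm_2d_matrix additionally accepts functions f, g, h, suggesting convection and reaction terms, which this sketch omits).

```python
import numpy as np
from scipy.sparse import identity, kron, diags

def fdm_laplacian_2d(n0):
    """Sparse (n0^2 x n0^2) matrix for -u_xx - u_yy on the unit square,
    homogeneous Dirichlet conditions, n0 inner grid points per direction."""
    h = 1.0 / (n0 + 1)
    T = diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(n0, n0)) / h**2
    I = identity(n0)
    return kron(T, I) + kron(I, T)   # Kronecker-sum structure

A = -fdm_laplacian_2d(30)            # negated so the coefficient is stable
print(A.shape)                       # (900, 900), i.e. n = n0**2
```

The n = n0^2 growth explains why, for n0 = 50, the resulting A is already 2500 × 2500 and projection onto a Krylov subspace pays off.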

Experiment 3
We describe and report here the results of numerical experiments carried out when solving large-scale low-rank differential Sylvester or Lyapunov equations. The performance of CS-BA is compared with that of IF-BA. The test matrices come either from the centered finite difference discretization of the operators L_A and L_B defined in the previous experiment, or from the SuiteSparse (University of Florida) matrix collection [14]. The matrices invoked for our tests from this collection are: pde900, pde2961, cdde1, Chem97ZtZ, thermal, rdb5000, sstmodel, add32 and rw5151.
Experiment 3.1 (a). In this example, the numerical results are those obtained from solving differential Sylvester equations. The time interval is fixed to [0, 1] (T = 1). The number of nodes is N = 10, which gives a time step δT = 0.1. The matrices A ∈ R^{n×n} and B ∈ R^{s×s} come from the discretization of the operators L_A and L_B. As indicated previously, the factors of the right-hand side, E ∈ R^{n×r} and F ∈ R^{s×r}, are randomly generated. The obtained results for different sizes n, s and ranks r are summarized in Table 4. Experiment 3.1 (b). Here, we consider two time intervals [0, T], with T = 1 and T = 10, in which the number of sub-intervals is always N = 10. The matrix A is from the SuiteSparse matrix collection. We consider the particular case B = A^T and F = E, and report the results obtained when solving low-rank differential Lyapunov equations. The obtained results for r = 2, r = 5 and r = 10 are displayed in Table 5.
Table 5
The obtained time ratios TR and relative difference norms RDN in Experiment 3.1 (b).
Experiment 3.2 (a). Here, we consider A0 = pde2961 and B0 = pde900 and transform them into A and B of sizes n = 8883 and s = 2700 respectively by using (36). In order to assess the influence of the rank r and/or the length T of the time interval on the performance of the CS and IF methods, we report in Table 6 the results obtained for two cases. We notice that in most tests, both methods manage to provide a good approximate solution and that the CPU time is in favor of the CS-BA method. However, we observed that for small values of α and β, and when the values of r and T are large, the IF-BA method failed to converge within a reasonable time. Non-convergence is indicated by "− − −". Experiment 3.2 (b). In this last set of experiments, we compare the performance of the CS and IF methods when they are applied to the solution of low-rank differential Lyapunov equations. Unlike the previous series of tests, we did not generate a discretization of the interval [0, T] and only computed the approximation X(T) at the final time, where T = 10. Similarly, the rank of C = E E^T does not vary and is r = 20. For each experiment with a matrix A0, taken from the SuiteSparse matrix collection [14], we considered four values of the scalar α used in the generation of the benchmark example. The size n0 of each matrix A0, the size n of the benchmark matrix A, and the obtained results are reported in Table 7.

Conclusion
In this work, we have proposed new techniques for solving Sylvester and Lyapunov matrix differential equations. Unlike the recent method proposed in [20], our approach avoids the integral formula, which significantly reduces the computational cost. The proposed method is made efficient for large-scale problems by exploiting a projection onto Krylov subspaces. Numerous numerical tests were carried out to show the effectiveness of the proposed method, and we have reported some of them in a dedicated section. The convergence of the method is proved and a constructive benchmark example is given.
• Availability of supporting data. No data sets were generated or analyzed during the current work.
• Competing interests. The authors declare no competing interests.
• Funding. No funding was provided to support this work.
• Authors' contributions. The three authors contributed equally to this work.
• Acknowledgments. We thank the anonymous reviewers and the editor for the criticism, comments, suggestions and remarks that helped us improve this manuscript.
From (15), we get (21), and the relation (22) follows immediately. Note that if H_A^m and B are stable, i.e., all the eigenvalues of H_A^m and B have negative real parts, it follows that lim_{T→+∞} e^{(T−t0) H_A^m} = 0 and lim_{T→+∞} e^{(T−t0) B} = 0_{s×s}.
and z_B^m = ∥H_B^{m+1,m} Z̄_m∥. The matrices Ȳ_m and Z̄_m are of size r × r and are formed by the last r rows of Y_m and Z_m := e^{ΔT H_A^m} Y_m e^{ΔT H_B^m} respectively.
On the other hand, comparing the time ratios between CS and IF, we have TR = 12.578/0.203 ≃ 61 for T = 1 and N = 10, and TR = 75.703/0.718 ≃ 105 for T = 10 and N = 50. We clearly see that CS is faster than IF, because the former avoids using a quadrature formula, as is the case for the latter.

Fig. 1 Experiment 1: comparison of the relative error norm. The reference solution is given by (37).

Table 7
Numerical results for Experiment 3.2 (b) with T = 10, N = 1 and r = 20. Test problems: A0 = cdde1 (n0 = 961, n = 2883), A0 = pde2961 (n0 = 2961, n = 8883), A0 = sstmodel (n0 = 3345, n = 10035); for each, the table reports α, TR, REN_CS-BA and REN_IF-BA.

Table 1
The obtained CPU times (in seconds) in Experiment 1.2.

Table 2
The obtained time ratios TR and relative difference norms RDN in Experiment 2.1.

Table 3
The obtained time ratios TR and relative error norms REN_CS-BA and REN_IF-BA in Experiment 2.2, with N = 10.

Table 4
The obtained time ratios TR and relative difference norms RDN in Experiment 3.1 (a).

Table 6
The obtained time ratios TR and relative error norms REN_CS-BA and REN_IF-BA in Experiment 3.2 (a).