Generalizing Frobenius Inversion to Quaternion Matrices

In this paper we derive and analyze an algorithm for inverting quaternion matrices. The algorithm is an analogue of the Frobenius algorithm for complex matrix inversion. On the theoretical side, we prove that our algorithm is more efficient than other existing methods; moreover, it is optimal in the sense of requiring the least number of complex matrix inversions. On the practical side, our algorithm outperforms existing algorithms on randomly generated matrices. We argue that this algorithm can be used to improve the practical utility of recursive Strassen-type algorithms by providing the fastest possible base case for the recursive decomposition process when applied to quaternion matrices.


Introduction
Quaternions are a typical example of a hypercomplex number system, first discovered by Hamilton in 1843 [18]. Since then, quaternions have been extensively studied and applied in various fields of mathematics, including contact and spin geometry [3,26], function theory [38,11], non-commutative algebra [24,6,7], linear algebra [47], and algebraic topology [20]. On the other side, since the group of unit quaternions is isomorphic to the group SU(2) of 2 × 2 special unitary matrices, quaternions provide a natural way to represent spatial rotations. Because of this surprising connection, quaternions are widely applied in areas related to rigid body mechanics and quantum mechanics [21,17,4,13,30,29,34]. Quaternions are also useful for combining related variables into a single algebraic entity that can be manipulated collectively in a natural way, e.g., representing RGB pixel values using the i, j, and k components of a single quaternion. During the past decade, novel quaternion-based methods and models have emerged in a wide variety of signal processing applications [41,16,44,40].
The increasing prevalence of quaternions in science and engineering has led to increased focus on the development of high-efficiency implementations of core utilities such as the quaternion LU decomposition [42], the quaternion SVD [25], low-rank approximation of quaternion matrices [27], and high-performance quaternion matrix multiplication [43].
The focus of this paper is on arguably the most fundamental quaternion matrix operation after multiplication: quaternion matrix inversion. Presently there are only four prominent algorithms for matrix inversion over the quaternions in the literature [42,47,39,33], and their uses include applications such as computing or estimating the inverse of a covariance matrix in statistics and machine learning [2,40,46]; quaternion batch normalization [14,31]; construction of orthogonal polynomials [12]; the computation of numerical integrals [19]; and hardware design of MIMO radios [37].
The structure of this paper is as follows. We begin with background on algorithms for matrix inversion and how the most efficient algorithm depends on the type (e.g., real, complex, or quaternion) of the matrix. We then derive an algorithm that is optimized for quaternion matrix inversion. After that, we discuss the computational complexity of our algorithm from two different points of view. Finally, we provide empirical results demonstrating its efficiency advantage over prior methods in the literature.

Background
The development of practically efficient methods for manually inverting matrices dates back to the early 20th century. An important early result, due to Frobenius in 1910, was a method for decomposing the inverse of a complex matrix into a form requiring only the evaluation of inverses of real matrices. Interest in methods of this type was renewed in the 1950s and 1960s as a means for inverting complex matrices in contexts in which complex variables were not natively supported, e.g., when numerical libraries contained only optimized routines for real matrices.
In 1969, Strassen showed that a special recursive block decomposition of the matrix multiplication problem could achieve better complexity than the conventional O(n^3)-time algorithm. Moreover, he showed that this implies related recursive algorithms with the same subcubic complexity for computing the determinant and inverse of a matrix. Refinements of the approach have led to progressively improving computational complexities over the decades [8], with O(n^{2.37286}) being the current best complexity for matrix multiplication [1] and, consequently, matrix inversion.
While Strassen-type algorithms provide better asymptotic computational complexities, their practical utility depends on the absolute efficiency offered for the problem at hand, i.e., whether they are faster than the best cubic-complexity algorithm for the matrix size of interest. More specifically, the practical efficiency of a generic Strassen-type recursion depends on the base-case efficiency of the fastest available cubic-complexity algorithm. In other words, there exists a fixed break-even matrix size at which it is faster to apply a cubic-complexity algorithm than to continue the recursive block decomposition down to the scalar base case. In the case of real matrix multiplication, the best base-case algorithm is the conventional cubic-complexity matrix multiplication algorithm. In the case of matrix inversion, however, the situation is much more subtle.
In [9], the authors established the surprising result that the Frobenius algorithm for inverting a complex matrix incurs less numerical overhead, and is demonstrably faster, than state-of-the-art methods based on LU and QR matrix decompositions. More specifically, they examined the Frobenius inversion method for a complex matrix Z = A + iB:

(1)  Z^{-1} = (A + B A^{-1} B)^{-1} - i A^{-1} B (A + B A^{-1} B)^{-1},

which requires only the evaluation of inverses of real matrices. The expression (1) requires only the A component of a nonsingular Z to be nonsingular, but it can be reformulated to require only B to be nonsingular. (Randomization, e.g., as suggested by Zielinski [48], can reduce the assumptions to the minimum of a nonsingular Z, but at the expense of additional overhead. In this paper we focus only on generalizations of (1), with the understanding that our results can be trivially reformulated to satisfy different assumptions about the structure of Z.) We now show that a similar approach can be applied to obtain improved computational efficiency for the case of quaternion matrices.
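As an illustration of (1), the following is a minimal numpy sketch, assuming the A component (and hence A + B A^{-1} B) is nonsingular; the function name and the sanity check are ours, not from [9]:

```python
import numpy as np

def frobenius_inverse(A: np.ndarray, B: np.ndarray) -> np.ndarray:
    """Invert Z = A + iB via (1), using only real inversions/solves.

    Assumes A and A + B A^{-1} B are nonsingular; a robust version would
    fall back to the B-nonsingular variant of (1) when A is singular.
    """
    M = np.linalg.solve(A, B)        # A^{-1} B (first real inversion)
    S = np.linalg.inv(A + B @ M)     # (A + B A^{-1} B)^{-1} (second real inversion)
    return S - 1j * (M @ S)          # real part S, imaginary part -(A^{-1} B) S

# Sanity check against numpy's native complex inverse.
rng = np.random.default_rng(0)
A, B = rng.standard_normal((2, 50, 50))
assert np.allclose(frobenius_inverse(A, B), np.linalg.inv(A + 1j * B))
```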

Quaternion Matrices
The space of quaternions, denoted H, generalizes the complex space C by introducing two additional imaginary components/dimensions, denoted j and k. Thus, whereas a complex number can be represented with two real scalar parameters a and b as a + ib, a quaternion has four real parameters and takes the form a + ib + jc + kd, i.e., it has one real and three imaginary components. Quaternions are algebraically important because they form a linear space with properties analogous to matrices, e.g., multiplication does not commute, while retaining many valuable properties of the complex numbers, e.g., forming a normed division algebra.
The imaginary components of a quaternion are analogous to the complex imaginary unit in that i^2 = j^2 = k^2 = -1, with component cross products ij = k, jk = i, and ki = j; reversing the order of the terms multiplies the result by -1, i.e., the imaginary units anticommute: ji = -k, kj = -i, and ik = -j.
While less familiar than complex numbers, quaternions have seen steadily increasing practical interest since the 1990s; they are now widely used in computer graphics and GIS systems to perform rotations about a specified vector, as well as in automatic control and system analysis applications involving state evolution on a sphere, e.g., circular orbits around the Earth. More recently, quaternion matrices, i.e., matrices with quaternion elements, have risen in prominence in applications ranging from representations of color information for image processing to uses in neural networks.
A quaternion matrix can be represented in the form

Z = A + iB + jC + kD

for real-valued matrices A, B, C, and D. A natural question, then, is whether the computation of the inverse of a given quaternion matrix can be made more efficient by using a decomposition inspired by that of Frobenius for complex matrices.
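For concreteness, one convenient machine representation stores the four real components directly. The following numpy sketch (the encoding and the name qmul are ours) multiplies two quaternion matrices with 16 real matrix products by expanding the Hamilton rules i^2 = j^2 = k^2 = -1, ij = k, jk = i, ki = j while preserving the order of the factors:

```python
import numpy as np

# A quaternion matrix Z = A + iB + jC + kD is encoded as the 4-tuple
# (A, B, C, D) of real n x n arrays.
def qmul(X, Y):
    """Hamilton product of two component-encoded quaternion matrices."""
    A1, B1, C1, D1 = X
    A2, B2, C2, D2 = Y
    return (A1 @ A2 - B1 @ B2 - C1 @ C2 - D1 @ D2,   # real part
            A1 @ B2 + B1 @ A2 + C1 @ D2 - D1 @ C2,   # i component
            A1 @ C2 - B1 @ D2 + C1 @ A2 + D1 @ B2,   # j component
            A1 @ D2 + B1 @ C2 - C1 @ B2 + D1 @ A2)   # k component
```

This naive product costs 16 real matrix multiplications, which is one motivation for working over C or R, rather than directly over H, when inverting.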

Inverting Quaternion Matrices
Our first observation is that the Frobenius method can be applied directly in the case that any two of the matrices B, C, and D are zero. For example, if B and C are zero, then Z = A + kD and

Z^{-1} = (A + D A^{-1} D)^{-1} - k A^{-1} D (A + D A^{-1} D)^{-1},

where the inverse is computed in the k complex plane. From this observation, it can be concluded that any generalized Frobenius-type inverse must satisfy the above with regard to each of i, j, and k when the matrix coefficients on the other two are zero. We observe that there is an inclusion of R-algebras:

R ⊆ C ⊆ H.

Moreover, C is a quadratic field extension of R, over which matrix inversion is thoroughly discussed in [9]. As a comparison, we notice that H is not an algebra over C, according to the Gelfand-Mazur theorem [15,28]. Thus the method developed in [9] does not apply directly to the inversion in M_n(H) over M_n(C). However, we still have

M_n(H) = M_n(C) ⊕ M_n(C) j.

This implies that given any Z, W ∈ M_n(H), we may write

Z = (A + iB) + (C + iD) j =: Z_1 + Z_2 j,  W = (E + iF) + (G + iH) j =: W_1 + W_2 j,

where A, B, C, D, E, F, G, H ∈ M_n(R). We observe that

ZW = (Z_1 W_1 - Z_2 \bar{W}_2) + (Z_1 W_2 + Z_2 \bar{W}_1) j.

Thus W is the inverse of Z if and only if

(7)  Z_1 W_1 - Z_2 \bar{W}_2 = I_n,
(8)  Z_1 W_2 + Z_2 \bar{W}_1 = 0.

Solving (7) and (8) for E, F, G, and H, we obtain the two items of the lemma that follows.
Lemma 4.1. Let Z = Z_1 + Z_2 j ∈ M_n(H) with Z_1, Z_2 ∈ M_n(C).
(a) If Z_1 and Z_1 + Z_2 \bar{Z}_1^{-1} \bar{Z}_2 are invertible, then Z is also invertible and

Z^{-1} = (Z_1 + Z_2 \bar{Z}_1^{-1} \bar{Z}_2)^{-1} - Z_1^{-1} Z_2 (\bar{Z}_1 + \bar{Z}_2 Z_1^{-1} Z_2)^{-1} j.

(b) If Z_2 and Z_2 + Z_1 \bar{Z}_2^{-1} \bar{Z}_1 are invertible, then Z is also invertible and

Z^{-1} = \bar{Z}_2^{-1} \bar{Z}_1 (Z_2 + Z_1 \bar{Z}_2^{-1} \bar{Z}_1)^{-1} - (\bar{Z}_2 + \bar{Z}_1 Z_2^{-1} Z_1)^{-1} j.

Proof. It is sufficient to prove (a). To that end, we recall that a quaternion z = a + ib + jc + kd ∈ H can be represented as the 2 × 2 complex matrix

\begin{pmatrix} a + ib & c + id \\ -c + id & a - ib \end{pmatrix},

where a, b, c, d ∈ R. Therefore, we may also represent Z = Z_1 + Z_2 j ∈ M_n(H) as

\widetilde{Z} = \begin{pmatrix} Z_1 & Z_2 \\ -\bar{Z}_2 & \bar{Z}_1 \end{pmatrix} ∈ M_{2n}(C),

where Z_1 = A + iB and Z_2 = C + iD for A, B, C, D ∈ M_n(R). As a consequence, inverting Z is equivalent to inverting \widetilde{Z}. If A - iB = \bar{Z}_1 is invertible, then we may invert \widetilde{Z} by the Schur complement, i.e.,

\widetilde{Z}^{-1} = \begin{pmatrix} S^{-1} & -S^{-1} Z_2 \bar{Z}_1^{-1} \\ \bar{Z}_1^{-1} \bar{Z}_2 S^{-1} & \bar{Z}_1^{-1} - \bar{Z}_1^{-1} \bar{Z}_2 S^{-1} Z_2 \bar{Z}_1^{-1} \end{pmatrix},

where \bar{X} denotes the conjugate of a matrix X and

S = Z_1 + Z_2 \bar{Z}_1^{-1} \bar{Z}_2

is the Schur complement of \bar{Z}_1 in \widetilde{Z}. Here the invertibility of A - iB is ensured by that of A + iB. Reading off the first block row of \widetilde{Z}^{-1} and using equations (7) and (8), the inverse of Z is

Z^{-1} = S^{-1} - S^{-1} Z_2 \bar{Z}_1^{-1} j = (Z_1 + Z_2 \bar{Z}_1^{-1} \bar{Z}_2)^{-1} - Z_1^{-1} Z_2 (\bar{Z}_1 + \bar{Z}_2 Z_1^{-1} Z_2)^{-1} j,

and this completes the proof.
Clearly, one can permute the roles of A, B, C, D in Lemma 4.1 to obtain more inversion formulae in M_n(H).
4.1. The Frobenius algorithm over M_n(C) and its optimality.

For notational simplicity, we write Z = Z_1 + Z_2 j ∈ M_n(H) with Z_1, Z_2 ∈ M_n(C) throughout this subsection. We describe the formulae in Lemma 4.1 (a)-(b) as pseudocode in Algorithm 1. It is clear that Algorithm 1 costs 2 inversions and 3 multiplications in M_n(C). Next we analyze the optimality of Algorithm 1. To that end, we need the following two lemmas.

Algorithm 1 Complex Frobenius inversion
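The pseudocode itself is not reproduced here; the following numpy sketch (names ours) implements the scheme of Lemma 4.1(a), returning (W1, W2) with Z^{-1} = W1 + W2 j, and uses the 2n × 2n complex embedding from the proof of Lemma 4.1 only as a sanity check:

```python
import numpy as np

def quaternion_frobenius_inverse(Z1: np.ndarray, Z2: np.ndarray):
    """Invert Z = Z1 + Z2 j per Lemma 4.1(a).

    Counting a solve as one inversion plus one multiplication, this uses
    2 inversions and 3 multiplications in M_n(C). Assumes Z1 and
    Z1 + Z2 conj(Z1)^{-1} conj(Z2) are nonsingular (the generic case).
    """
    M = np.linalg.solve(Z1.conj(), Z2.conj())   # conj(Z1)^{-1} conj(Z2)
    W1 = np.linalg.inv(Z1 + Z2 @ M)             # (Z1 + Z2 conj(Z1)^{-1} conj(Z2))^{-1}
    W2 = -(M @ W1).conj()                       # equals -Z1^{-1} Z2 conj(W1)
    return W1, W2                               # Z^{-1} = W1 + W2 j

# Sanity check via the 2n x 2n complex embedding of M_n(H).
def embed(Z1, Z2):
    return np.block([[Z1, Z2], [-Z2.conj(), Z1.conj()]])

rng = np.random.default_rng(1)
Z1, Z2 = (rng.standard_normal((2, 50, 50))
          + 1j * rng.standard_normal((2, 50, 50)))
W1, W2 = quaternion_frobenius_inverse(Z1, Z2)
assert np.allclose(embed(W1, W2), np.linalg.inv(embed(Z1, Z2)))
```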
Lemma 4.2. For any field k and A, B ∈ M_n(k) such that both A and A + BA^{-1}B are invertible, it requires at least two matrix inversions to compute (A + BA^{-1}B)^{-1}, no matter how many matrix additions, matrix multiplications, or scalar multiplications in k are used.

Lemma 4.3. Let p_1, p_2, q_1, q_2 be polynomials over C in n variables. Assume that p_1(a) q_2(a) = p_2(a) q_1(a) for any a = (a_1, ..., a_n) ∈ R^n such that q_1(a) ≠ 0 and q_2(a) ≠ 0. Then p_1 q_2 = p_2 q_1.

Proof. We observe that any polynomial f over C can be uniquely written as f = Re(f) + i Im(f), where Re(f) and Im(f) are polynomials with real coefficients. We assume that p_1 q_2 - p_2 q_1 ≠ 0. Then it is clear that either Re(p_1 q_2 - p_2 q_1) ≠ 0 or Im(p_1 q_2 - p_2 q_1) ≠ 0. Without loss of generality, we may assume that Re(p_1 q_2 - p_2 q_1) ≠ 0. Similarly, we may also assume Re(q_1) ≠ 0 and Re(q_2) ≠ 0. Since R is an infinite field, there exists a = (a_1, ..., a_n) ∈ R^n such that

(Re(p_1 q_2 - p_2 q_1) Re(q_1) Re(q_2))(a) ≠ 0.
Hence (p_1/q_1 - p_2/q_2)(a) ≠ 0 with q_1(a) ≠ 0 and q_2(a) ≠ 0, which contradicts the assumption.

Now we are ready to prove the optimality of Algorithm 1. To simplify the notation, we again write Z = Z_1 + Z_2 j with Z_1, Z_2 ∈ M_n(C). According to Algorithm 1, we may also assume that Z_1 is invertible, so that Z^{-1} = W_1 - j W_2, where

W_1 = (Z_1 + Z_2 \bar{Z}_1^{-1} \bar{Z}_2)^{-1},  W_2 = \bar{Z}_1^{-1} \bar{Z}_2 W_1.

Here \bar{X} denotes the complex conjugate of a complex matrix X.
Theorem 4.4. It requires at least 1 addition and 2 inversions in M_n(C) to compute Z^{-1} = W_1 - j W_2. In particular, Algorithm 1 is optimal in the sense of the least number of complex matrix inversions.
We remark that Theorem 4.4 should be understood as follows: 2 is the minimum number of complex matrix inversions required to compute Z −1 , regardless of the number of additions, matrix multiplications, scalar multiplications and conjugations used.
Proof. One addition is obviously necessary, thus it is sufficient to prove the necessity of 2 inversions. We proceed by contradiction. Assume that there exists an algorithm which costs only one inversion. Then we may find complex noncommutative polynomials f and g such that (11) holds. Clearly, one can further find complex noncommutative polynomials h and φ so that (11) may be simplified as (12).
Next we consider the rational map Φ : M_n(C) × M_n(C) ⇢ M_n(C) determined by (12). In particular, the elements of the matrix Φ(Z_1, Z_2) are rational functions on M_n(C) × M_n(C). By (12), we see immediately that Φ ≡ 0 on M_n(R) × M_n(R) whenever it is defined. Therefore Lemma 4.3 implies that the elements of Φ must vanish on M_n(C) × M_n(C) whenever they are defined. In particular, Φ(Z_1, Z_2) ≡ 0 on M_n(C) × M_n(C) whenever it is defined, from which we may conclude that (A + BA^{-1}B)^{-1}, for A, B ∈ M_n(R), can be computed with one matrix inversion. This contradicts Lemma 4.2.

4.2. The Frobenius algorithm over M_n(R) and its complexity.

According to [9], an inversion of a generic element in M_n(C) costs 2 inversions and 3 multiplications in M_n(R), and a multiplication in M_n(C) costs 3 multiplications in M_n(R). Therefore, the total cost of inverting a generic Z ∈ M_n(H) by Algorithm 1, together with the algorithms in [9] for multiplication and inversion in M_n(C), is at most 4 inversions and 15 multiplications in M_n(R). However, we observe that two of these multiplications are repeated; eliminating them yields Algorithm 2, which inverts a generic Z ∈ M_n(H) using 4 inversions and 13 multiplications in M_n(R).
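To make the pipeline concrete, here is a numpy sketch of the all-real computation, assuming the real matrices that get inverted are nonsingular (the generic case). Complex multiplications use the three-multiplication Gauss trick and complex inversions use Frobenius inversion over M_n(R); as written it performs the full 4 inversions and 15 multiplications, and a faithful Algorithm 2 would additionally cache the two repeated products, which this sketch does not attempt to identify:

```python
import numpy as np

def gauss_mult(A, B, C, D):
    """(A + iB)(C + iD) with 3 real matrix multiplications (Gauss trick)."""
    P1, P2 = A @ C, B @ D
    P3 = (A + B) @ (C + D)
    return P1 - P2, P3 - P1 - P2             # (real part, imaginary part)

def frobenius_inv(A, B):
    """(A + iB)^{-1} with 2 real inversions and 3 real multiplications."""
    M = np.linalg.solve(A, B)                # A^{-1} B
    S = np.linalg.inv(A + B @ M)             # (A + B A^{-1} B)^{-1}
    return S, -(M @ S)

def quaternion_inverse_real(A, B, C, D):
    """Invert Z = A + iB + jC + kD = Z1 + Z2 j using only real arithmetic."""
    X1, X2 = frobenius_inv(A, -B)            # conj(Z1)^{-1}, since conj(Z1) = A - iB
    M1, M2 = gauss_mult(X1, X2, C, -D)       # M = conj(Z1)^{-1} conj(Z2)
    T1, T2 = gauss_mult(C, D, M1, M2)        # Z2 M
    W11, W12 = frobenius_inv(A + T1, B + T2) # W1 = (Z1 + Z2 M)^{-1}
    U1, U2 = gauss_mult(M1, M2, W11, W12)    # M W1, so W2 = -conj(M W1)
    return W11, W12, -U1, U2                 # Z^{-1} = E + iF + jG + kH
```

The result can be checked against the componentwise product qmul from the earlier sketch.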

Comparison with Existing Methods
In this section, we compare the computational overhead of the Generalized Frobenius Inversion (Algorithms 1 and 2) with existing methods for quaternion matrix inversion.

5.1. Four Alternative Methods. As far as we are aware, only four methods for quaternion matrix inversion have previously been discussed in the literature. Before we proceed further, we briefly summarize these methods.

5.1.1. Complex Method. Let ϕ_1 : M_n(H) → M_{2n}(C) be the injective map defined by

ϕ_1(Z_1 + Z_2 j) = \begin{pmatrix} Z_1 & Z_2 \\ -\bar{Z}_2 & \bar{Z}_1 \end{pmatrix}.

Clearly ϕ_1 is a homomorphism of algebras over R which sends a quaternion matrix to a complex matrix of twice the size. Converting quaternion matrices into complex matrices is a standard technique [47,42], from which we may obtain Algorithm 3.

Algorithm 3 Complex Method
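A minimal numpy rendering of this approach (function name ours), using ϕ_1 as displayed above:

```python
import numpy as np

def complex_method_inverse(Z1: np.ndarray, Z2: np.ndarray):
    """Algorithm 3 sketch: invert Z = Z1 + Z2 j through phi_1."""
    n = Z1.shape[0]
    big = np.block([[Z1, Z2], [-Z2.conj(), Z1.conj()]])   # phi_1(Z)
    W = np.linalg.inv(big)                                # phi_1(Z^{-1})
    return W[:n, :n], W[:n, n:]                           # Z^{-1} = W1 + W2 j
```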
5.1.2. Real Method. We may also convert a quaternion matrix into a real matrix of four times the size [47,42] via an injective map ϕ_2 : M_n(H) → M_{4n}(R), defined (up to a choice of sign convention) by

ϕ_2(A + iB + jC + kD) = \begin{pmatrix} A & -B & -C & -D \\ B & A & -D & C \\ C & D & A & -B \\ D & -C & B & A \end{pmatrix}.

Again, ϕ_2 is an R-algebra homomorphism, which leads to Algorithm 4.

Algorithm 4 Real Method
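A corresponding numpy sketch (names ours), using the sign convention for ϕ_2 displayed above; the four components of the inverse are read off the first block column of ϕ_2(Z^{-1}):

```python
import numpy as np

def real_method_inverse(A, B, C, D):
    """Algorithm 4 sketch: invert Z = A + iB + jC + kD through phi_2."""
    n = A.shape[0]
    big = np.block([[A, -B, -C, -D],
                    [B,  A, -D,  C],
                    [C,  D,  A, -B],
                    [D, -C,  B,  A]])                     # phi_2(Z)
    W = np.linalg.inv(big)
    # The first block column of phi_2(Z^{-1}) carries the four components.
    return W[:n, :n], W[n:2*n, :n], W[2*n:3*n, :n], W[3*n:, :n]
```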
5.1.3. Skew Real Method. By exploiting the block skew-symmetric structure of ϕ_2(A + iB + jC + kD), a more efficient approach is proposed in [39]. For ease of reference, we record the method as Algorithm 5. We remark that the original approach in [39] is presented in terms of block matrices, while Algorithm 5 is a simplified and more efficient version.

Algorithm 5 Skew Real Method
5.1.4. Quaternion Toolbox for Matlab (QTFM). An algorithm for quaternion matrix inversion is implemented in QTFM [33]. The main idea behind this approach is to partition a matrix Z ∈ M_n(H) into the 2 × 2 block form

Z = \begin{pmatrix} Z_{11} & Z_{12} \\ Z_{21} & Z_{22} \end{pmatrix}.

Then the inverse of Z can be computed by the Schur complement [22], and one can repeat the procedure recursively to obtain Algorithm 6 for Z^{-1}. The key feature of Algorithm 6 that distinguishes it from Algorithms 1-5 is that the operations in Algorithm 6 are all over H, rather than C or R.

Algorithm 6 QTFM method (recursive Schur-complement inversion in M_{n/2}(H))
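A recursive numpy sketch of this idea (names ours), assuming the leading block is invertible at every level (the generic case); qmul is repeated from the earlier component-encoding sketch for self-containedness:

```python
import numpy as np

def qmul(X, Y):   # Hamilton product, as in the earlier component-encoding sketch
    A1, B1, C1, D1 = X
    A2, B2, C2, D2 = Y
    return (A1 @ A2 - B1 @ B2 - C1 @ C2 - D1 @ D2,
            A1 @ B2 + B1 @ A2 + C1 @ D2 - D1 @ C2,
            A1 @ C2 - B1 @ D2 + C1 @ A2 + D1 @ B2,
            A1 @ D2 + B1 @ C2 - C1 @ B2 + D1 @ A2)

def qsub(X, Y):
    return tuple(x - y for x, y in zip(X, Y))

def qblock(X, r0, r1, c0, c1):
    return tuple(x[r0:r1, c0:c1] for x in X)

def qinv_schur(Z):
    """Algorithm 6 sketch: recursive Schur-complement inversion over H."""
    n = Z[0].shape[0]
    if n == 1:                                  # scalar base: q^{-1} = conj(q)/|q|^2
        a, b, c, d = Z
        s = a**2 + b**2 + c**2 + d**2
        return (a / s, -b / s, -c / s, -d / s)
    m = n // 2
    Z11, Z12 = qblock(Z, 0, m, 0, m), qblock(Z, 0, m, m, n)
    Z21, Z22 = qblock(Z, m, n, 0, m), qblock(Z, m, n, m, n)
    W11 = qinv_schur(Z11)                       # Z11^{-1}, recursively
    T = qmul(Z21, W11)                          # Z21 Z11^{-1}
    S = qinv_schur(qsub(Z22, qmul(T, Z12)))     # Schur complement inverse
    U = qmul(W11, Z12)                          # Z11^{-1} Z12
    BL = tuple(-x for x in qmul(S, T))          # -S^{-1} Z21 Z11^{-1}
    TR = tuple(-x for x in qmul(U, S))          # -Z11^{-1} Z12 S^{-1}
    TL = qsub(W11, qmul(U, BL))                 # Z11^{-1} + Z11^{-1} Z12 S^{-1} Z21 Z11^{-1}
    return tuple(np.block([[tl, tr], [bl, br]])
                 for tl, tr, bl, br in zip(TL, TR, BL, S))
```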

5.2. Comparison.
In this subsection, we compare the computational complexities of Algorithms 1-6. We denote by inv_{n,F} (resp. mult_{n,F}, add_{n,F}) an inversion (resp. multiplication, addition) in M_n(F), where F = R, C, or H. In Table 1, we record the operations required by each of Algorithms 1-6 in the column labelled "operations".
Let A be an algorithm for matrix multiplication over H. We denote by c_{n,H}, c_{n,C}, and c_{n,R} the complexity of A restricted to M_n(H), M_n(C), and M_n(R), respectively.

Theorem 5.1. If we compute mult_{n,H} by A and compute inv_{n,C} by the LU-decomposition method, then the complexity (ignoring lower-order terms) of Algorithms 1-6 is as shown in the column of Table 1 labelled "complexity". In particular, if A is the usual algorithm for matrix multiplication, and if operations in H and C are computed by the usual methods over R, then among Algorithms 1-6, Algorithm 2 is optimal.
Proof. We recall that inverting an n × n matrix by LU decomposition over C (resp. R) costs, ignoring lower-order terms, (4n^3/3)(mult_{1,C} + add_{1,C}) (resp. (4n^3/3)(mult_{1,R} + add_{1,R})). If A is the usual algorithm for matrix multiplication, then we have

c_{n,F} = n^3 (mult_{1,F} + add_{1,F}),

where F = H, C, or R. Moreover, if we compute operations in H and C in the usual way, then we also have

mult_{1,H} = 16 mult_{1,R} + 12 add_{1,R},  add_{1,H} = 4 add_{1,R},
mult_{1,C} = 4 mult_{1,R} + 2 add_{1,R},  add_{1,C} = 2 add_{1,R}.

This implies that Algorithms 1-6 respectively cost 136n^3/3, 110n^3/3, 256n^3/3, 512n^3/3, 128n^3/3, and 128n^3 flops in R, from which we may conclude the optimality of Algorithm 2.
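As a worked check of these counts for the two Frobenius-type algorithms (the bookkeeping below is ours, under the conventions just stated, so that a real inversion costs 8n^3/3 flops and a real multiplication 2n^3 flops):

\[
\text{Algorithm 2: } 4 \cdot \frac{8n^3}{3} + 13 \cdot 2n^3 = \frac{32n^3}{3} + \frac{78n^3}{3} = \frac{110n^3}{3},
\qquad
\text{Algorithm 1: } 2 \cdot \frac{32n^3}{3} + 3 \cdot 8n^3 = \frac{136n^3}{3},
\]

where for Algorithm 1 a complex LU inversion costs (4n^3/3)(6 + 2) = 32n^3/3 flops and a complex multiplication costs n^3(6 + 2) = 8n^3 flops.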

We conclude this subsection with a remark on the computation of operations in H and C over R. It is easy to prove that the usual algorithms for add_{1,H} and add_{1,C} are already optimal. It is known [23,10] that any non-commutative algorithm for mult_{1,H} costs at least 8 multiplications in R, yet there is no known algorithm for mult_{1,H} that costs 8 multiplications and fewer than 32 additions in R. Thus the usual algorithm is in fact the optimal one (in the sense of the total cost of multiplications and additions) among all existing algorithms for mult_{1,H}. We reiterate that we are assessing optimality in terms of the set of cubic-complexity quaternion algorithms currently available. Those algorithms, and ours, can of course be applied as the base case within the generic recursions used by the best available Strassen-type algorithm, with the optimal among that set clearly providing the best coefficient on the overall subcubic complexity.
Similarly, it is well known [35,45,36,32,5] that any non-commutative algorithm for mult_{1,C} requires at least 3 multiplications in R, and the optimum is achieved by the Gauss algorithm. However, the Gauss algorithm costs 5 additions in R. Therefore, the usual algorithm is the most efficient among all known algorithms for mult_{1,C}.

5.3. Experiments. For each integer 1 ≤ m ≤ 50, we take n = 100m and apply Algorithms 1-6 to invert random matrices in M_n(H). To be more precise, we generate n × n real matrices A, B, C, and D whose elements are drawn uniformly from the interval (-1, 1). Then, with probability one, Algorithms 1-6 are all applicable to Z = A + iB + jC + kD ∈ M_n(H). For each m, we repeat the above procedure 50 times to obtain 50 samples for our test.
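A minimal harness mirroring this protocol (the seed and helper names are ours; the timing and recording loop is only indicated):

```python
import numpy as np

rng = np.random.default_rng(2023)   # arbitrary seed

def random_quaternion_matrix(n):
    """Z = A + iB + jC + kD with entries of A, B, C, D i.i.d. uniform on
    (-1, 1); such a Z is invertible with probability one."""
    return tuple(rng.uniform(-1.0, 1.0, size=(n, n)) for _ in range(4))

for m in range(1, 51):
    n = 100 * m
    for trial in range(50):
        Z = random_quaternion_matrix(n)
        # ... time each of Algorithms 1-6 on Z and accumulate averages ...
```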
We record in Figure 1 the average running time of each of Algorithms 1-6. For better comparison, we also compute the ratios

r_{n,s} = t_{n,5} / t_{n,s},  s ∈ {1, 2, 3, 4, 6}.
Here t_{n,p} denotes the average running time of Algorithm p for instances of size n × n. By definition, r_{n,s} > r_{n,s'} indicates that Algorithm s is more efficient than Algorithm s'. Figure 2 exhibits how r_{n,s} varies as n grows for each s ∈ {1, 2, 3, 4, 6}. It is clear from Figures 1 and 2 that Algorithm 2 is significantly faster than the other five algorithms.
Let Ẑ be the inverse of Z obtained by one of Algorithms 1-6. We measure the quality of Ẑ by the mean right residual, defined in terms of the Frobenius norm ‖·‖_F on M_n(H). We record in Figure 3 the average of the mean right residual over the 50 samples for each of Algorithms 1-6. From Figure 3, we may easily conclude that the inverses computed by all six algorithms are of comparable, high accuracy.
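For completeness, here is a sketch of how such a residual can be evaluated in the component encoding, with qmul as in the earlier sketch; the paper's displayed definition of the mean right residual is not reproduced above, so the 1/n scaling below is only our placeholder assumption:

```python
import numpy as np

def right_residual(Z, W, qmul):
    """Frobenius-norm right residual of a computed inverse W of Z,
    both encoded as (A, B, C, D) tuples of real arrays.
    The division by n is a placeholder normalization (assumption)."""
    n = Z[0].shape[0]
    E = list(qmul(Z, W))                        # components of Z W
    E[0] = E[0] - np.eye(n)                     # Z W - I (the identity is real)
    return np.sqrt(sum(np.sum(x * x) for x in E)) / n
```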

Discussion
In this paper we have presented a new algorithm for efficiently inverting quaternion matrices, and we have provided empirical results showing that it is the fastest among the quaternion inversion algorithms found in the literature. Consequently, it provides the optimal recursive base case for Strassen-type inversion algorithms, achieving the best combination of computational complexity and practical performance for inverting quaternion matrices.

Figure 1. Algorithm 2 (blue) incurs significantly less computational overhead than the other five algorithms.

Figure 2. The ratio of Algorithm 2 (blue) is significantly greater than the ratios of the other algorithms.

Figure 3. The mean right residual of Algorithm 1 is at most 5 × 10^{-13} for all but one dimension.

Table 1. Comparison of the complexities of Algorithms 1-6.

Corollary 5.2. Let A be an algorithm for matrix multiplication that is faster than the usual algorithm. If we compute operations in H and C by the usual methods over R, then Algorithm 2 is still optimal among Algorithms 1-6.