Two Public-Key Cryptosystems Based on Expanded Gabidulin Codes

This paper presents two public key cryptosystems based on the so-called expanded Gabidulin codes, which are constructed by expanding Gabidulin codes over the base field. Exploiting the fast decoder of Gabidulin codes, we propose an efficient algorithm to decode these new codes when the noise vector satisfies a certain condition. Additionally, these new codes have an excellent error-correcting capability because of the optimality of their parent Gabidulin codes. With different masking techniques, we give two encryption schemes by using expanded Gabidulin codes in the McEliece setting. Being constructed over the base field, these two proposals can prevent the existing structural attacks using the Frobenius map. Based on the distinguisher for Gabidulin codes, we propose a distinguisher for expanded Gabidulin codes by introducing the concept of the so-called twisted Frobenius power. It turns out that the public code in our proposals seems indistinguishable from random codes under this distinguisher. Furthermore, our proposals have an obvious advantage in public key representation without using the cyclic or quasi-cyclic structure compared to some other code-based cryptosystems. To achieve the security of 256 bits, for instance, a public key size of 37583 bytes is enough for our first proposal, while around 1044992 bytes are needed for Classic McEliece selected as a candidate of the third round of the NIST PQC project.


Introduction
Over the past decades, cryptosystems based on coding theory have been drawing more and more attention due to the rapid development of quantum computers. The first code-based cryptosystem, known as McEliece cryptosystem [1] based on Goppa codes, was proposed by McEliece in 1978. The principle for McEliece's proposal is to first encode the plaintext with a random generator matrix of the distorted Goppa code and then add some random errors. Since then various studies [2][3][4][5][6] have been made to investigate the security of McEliece cryptosystem.
In addition to its potential resistance against quantum computer attacks, the McEliece cryptosystem has fast encryption and decryption procedures. However, this system has never been used in practice due to the large public key size. To overcome this problem, a series of variants has been proposed. For instance, the authors in [7] proposed to use the automorphism groups of Goppa codes to build decodable error patterns of larger weight, which greatly strengthens the system against decoding attacks. This allows smaller codes to be used in the design of encryption schemes, thereby reducing the public key size. Unfortunately, this variant was shown to be vulnerable to the chosen-plaintext attacks proposed in [8]. In [9], the authors proposed the family of quasi-dyadic Goppa codes, which admit a very compact representation of the parity-check or generator matrix, for efficiently designing syndrome-based cryptosystems. However, the authors in [10] mounted an efficient key-recovery attack against this variant for almost all the proposed parameters.
Besides endowing Goppa codes with some special structures, replacing Goppa codes with other families of codes is another approach to shorten the public keys. For instance, Niederreiter [11] introduced a knapsack-type cryptosystem based on generalized Reed-Solomon (GRS) codes. In Niederreiter's proposal, the message sender first converts the plaintext into a vector of fixed weight and then multiplies it with a parity-check matrix of the public code. The advantage of GRS codes consists in their optimal error-correcting capability, which enables us to reduce the public key size by exploiting codes with smaller parameters. However, this variant was proved to be insecure by Sidelnikov and Shestakov in [12] for the reason that GRS codes are highly structured. But if we use Goppa codes in the Niederreiter setting, it was proved to be equivalent to McEliece cryptosystem in terms of security [15]. To strengthen resistance against structural attacks, the authors in [16] performed a column-mixing transformation instead of a simple permutation to the underlying GRS code. According to their analysis, this variant could prevent some well-known attacks, such as Sidelnikov-Shestakov attack [12] and Wieschebrink's attack [17]. However, in [18] the authors presented a polynomial key-recovery attack in some cases. Although one can adjust the parameters to prevent such an attack, it would introduce some other problems, such as the decryption complexity increasing dramatically and a higher request of error-correcting capability for the underlying code. In [19], the authors introduced the concept of expanded GRS codes and designed an encryption scheme by using these codes in the Niederreiter setting. However, this scheme was already partially broken by Couvreur and Lequesne in [14] for the case of λ = 2 and m = 3.
In [21] Gabidulin introduced a new family of rank metric codes, known as Gabidulin codes, which can be seen as an analogue of GRS codes in the rank metric. The particular appeal of rank metric codes is that the general decoding problem is much more difficult than that of Hamming metric codes [22,23]. This inspires us to obtain a much smaller public key size by building cryptosystems in the rank metric. In [24] the authors proposed the GPT cryptosystem by using Gabidulin codes in the McEliece setting, which requires a public key size of only a few thousand bits for the security of 100 bits. Similar to the cryptosystems based on GRS codes, the GPT cryptosystem and some of its variants [25][26][27][28] have been subjected to many structural attacks [29][30][31][32]. Faure and Loidreau proposed another cryptosystem [34] that is quite different from the GPT proposal. The security of this scheme closely relates to the intractability of reconstructing linearized polynomials. Until the work in [35], the Faure-Loidreau scheme had never been severely attacked. In [36] Loidreau designed another rank metric based cryptosystem in the McEliece setting, where a column-mixing transformation was applied to the underlying code using the inverse of an invertible matrix whose entries are taken from an $\mathbb{F}_q$-subspace of $\mathbb{F}_{q^m}$ of dimension $\lambda$. Loidreau claimed that this proposal could prevent all the existing structural attacks. However, this claim was shown to be invalid by the authors in [38] for the case of $\lambda = 2$ and code rate greater than 1/2. Not long after this, the author in [39] extended this attack to the case of $\lambda = 3$.
In [33], Berger et al. used the so-called Gabidulin matrix codes to design cryptosystems, which can be seen as a rank metric counterpart of expanded GRS codes. Our work in the present paper is inspired by the variants [19,33] and uses the so-called expanded Gabidulin codes as the underlying code. Benefitting from the optimality of their parent Gabidulin codes, these new codes have excellent capability of correcting Hamming errors. This enables us to reduce the public key size by exploiting smaller codes. Because of our proposals being constructed over the base field, all the existing structural attacks based on the Frobenius map do not work any longer.
The rest of this paper is arranged as follows. In Section 2, notations and some basic concepts about rank metric codes and Gabidulin codes will be given. In Section 3, we shall introduce the so-called expanded Gabidulin codes and propose an efficient algorithm to decode these new codes. Section 4 introduces two hard problems in coding theory, as well as the best known generic attacks on them. Section 5 is devoted to a formal description of our two proposals constructed by using expanded Gabidulin codes in the McEliece setting. Section 6 gives the security analysis of our proposals, including structural attacks and generic attacks. In Section 7, we give some suggested parameters for different security levels and make a comparison on public key size with some other code-based cryptosystems. Following this, we make a few concluding remarks in Section 8.

Preliminaries
In this section we first introduce some notations used throughout this paper, and recall some basic concepts about linear codes and rank metric codes. After that, we will introduce the definition of Gabidulin codes and some related results.

Notations and basic concepts
Let $q$ be a prime power. Denote by $\mathbb{F}_q$ the finite field with $q$ elements, and by $\mathbb{F}_{q^m}$ an extension field of $\mathbb{F}_q$ of degree $m$. For two positive integers $k$ and $n$, let $\mathcal{M}_{k,n}(\mathbb{F}_q)$ denote the space of all $k \times n$ matrices over $\mathbb{F}_q$, and $GL_n(\mathbb{F}_q)$ the general linear group of all invertible matrices in $\mathcal{M}_{n,n}(\mathbb{F}_q)$. For a matrix $M \in \mathcal{M}_{k,n}(\mathbb{F}_q)$, let $\langle M \rangle_{\mathbb{F}_q}$ be the vector space spanned by the rows of $M$ over $\mathbb{F}_q$.
An $[n, k]$ linear code $\mathcal{C}$ over $\mathbb{F}_q$ is a $k$-dimensional subspace of $\mathbb{F}_q^n$. An element of $\mathcal{C}$ is called a codeword of $\mathcal{C}$. The dual code of $\mathcal{C}$, denoted by $\mathcal{C}^{\perp}$, is the orthogonal space of $\mathcal{C}$ under the usual inner product over $\mathbb{F}_q^n$. A matrix $G$ is called a generator matrix of $\mathcal{C}$ if its rows form a basis of $\mathcal{C}$. A generator matrix $H$ of $\mathcal{C}^{\perp}$ is called a parity-check matrix of $\mathcal{C}$. For a codeword $c \in \mathcal{C}$, the Hamming weight of $c$, denoted by $w_H(c)$, is the number of nonzero components of $c$. The minimum Hamming distance of $\mathcal{C}$, denoted by $d_H(\mathcal{C})$, is defined as the minimum Hamming weight of nonzero codewords in $\mathcal{C}$. The minimum Hamming distance of $\mathcal{C}$ is bounded from above by $n - k + 1$, and we call $\mathcal{C}$ Maximum Distance Separable (MDS) when $d_H(\mathcal{C})$ attains this bound.

Rank metric codes
Now we recall some basic concepts about rank metric codes.
Definition 1. For a vector $x \in \mathbb{F}_{q^m}^n$, the rank support of $x$, denoted by $\mathrm{Supp}(x)$, is defined as the linear space spanned by the components of $x$ over $\mathbb{F}_q$.
Definition 2. For a vector $x \in \mathbb{F}_{q^m}^n$, the rank weight of $x$, denoted by $w_R(x)$, is defined as the dimension of $\mathrm{Supp}(x)$ over $\mathbb{F}_q$.
Definition 3. For two vectors $x, y \in \mathbb{F}_{q^m}^n$, the rank distance between $x$ and $y$, denoted by $d_R(x, y)$, is defined to be the rank weight of $x - y$.
Definition 4. For a linear code $\mathcal{C} \subseteq \mathbb{F}_{q^m}^n$, the minimum rank distance of $\mathcal{C}$, denoted by $d_R(\mathcal{C})$, is defined to be the minimum rank weight of nonzero codewords in $\mathcal{C}$.
A linear code endowed with the rank metric is called a rank metric code. Similar to Hamming metric codes, the minimum rank distance of a rank metric code is bounded from above by the Singleton-type bound as described in the following proposition.
Proposition 1 (Singleton-type bound). [37] For positive integers $k \le n \le m$, let $\mathcal{C}$ be an $[n, k]$ rank metric code over $\mathbb{F}_{q^m}$. Then the minimum rank distance of $\mathcal{C}$ with respect to $\mathbb{F}_q$ satisfies the inequality $d_R(\mathcal{C}) \le n - k + 1$.
Remark 1. A rank metric code attaining the Singleton-type bound is called a Maximum Rank Distance (MRD) code. Let $\mathcal{C} \subseteq \mathbb{F}_{q^m}^n$ be an $[n, k]$ MRD code and $c \in \mathcal{C}$ be a nonzero codeword with $w_H(c) = d_H(\mathcal{C})$. It is clear that $n - k + 1 \le w_R(c) \le w_H(c) \le n - k + 1$. Then $w_H(c) = n - k + 1$, which implies that an MRD code is MDS in the Hamming metric.
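To make Definitions 1 and 2 concrete, the following minimal sketch computes the rank weight and the Hamming weight of a vector. It assumes $q = 2$ and represents each element of $\mathbb{F}_{2^m}$ as an integer whose bits are its coordinates in some fixed basis; the function names are illustrative and do not come from the paper.

```python
def gf2_rank(values):
    """Rank over F_2 of integers read as bit vectors (XOR-basis elimination)."""
    basis = []
    for v in values:
        for b in basis:
            v = min(v, v ^ b)  # clears the leading bit of b from v when it is set
        if v:
            basis.append(v)
    return len(basis)


def hamming_weight(x):
    """Number of nonzero components of x."""
    return sum(1 for c in x if c != 0)


def rank_weight(x):
    """Dimension over F_2 of the span of the components of x (assumes q = 2)."""
    return gf2_rank(x)


# Toy vector over F_16: the third component is the sum of the first two.
x = [0b0001, 0b0010, 0b0011, 0b0000]
print(hamming_weight(x), rank_weight(x))
```

In this toy vector three components are nonzero but they span only a two-dimensional space, which illustrates the inequality $w_R(x) \le w_H(x)$ used in Remark 1.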

Gabidulin codes
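For an integer $i$, we denote by $[i] = q^i$ the $i$-th Frobenius power. Under this notation, $\alpha^{q^i}$ can be simply written as $\alpha^{[i]}$ for any $\alpha \in \mathbb{F}_{q^m}$. For a vector $v \in \mathbb{F}_{q^m}^n$, we denote by $v^{[i]}$ the $i$-th component-wise Frobenius power of $v$. For a linear code $\mathcal{C} \subseteq \mathbb{F}_{q^m}^n$, the $i$-th Frobenius power of $\mathcal{C}$ is defined as
$$\mathcal{C}^{[i]} = \{ c^{[i]} : c \in \mathcal{C} \}.$$
Definition 5 (Gabidulin codes). For positive integers $k \le n \le m$, let $g = (g_1, \ldots, g_n) \in \mathbb{F}_{q^m}^n$ with $w_R(g) = n$. The $[n, k]$ Gabidulin code $\mathrm{Gab}_{n,k}(g)$ generated by $g$ is defined to be the linear code having a generator matrix of the form
$$G = \begin{pmatrix} g_1 & g_2 & \cdots & g_n \\ g_1^{[1]} & g_2^{[1]} & \cdots & g_n^{[1]} \\ \vdots & \vdots & & \vdots \\ g_1^{[k-1]} & g_2^{[k-1]} & \cdots & g_n^{[k-1]} \end{pmatrix}.$$
Similar to GRS codes in the Hamming metric, Gabidulin codes also admit an excellent error-correcting capability and a simple algebraic structure. The following two theorems describe some properties of Gabidulin codes.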

Theorem 6. [40] A Gabidulin code is an MRD code. In other words, the minimum rank weight of $\mathrm{Gab}_{n,k}(g)$ attains the Singleton-type bound.
This implies that the Gabidulin code $\mathrm{Gab}_{n,k}(g)$ can theoretically correct up to $\lfloor \frac{n-k}{2} \rfloor$ rank errors, which is an important reason for Gabidulin codes being widely used in the design of cryptosystems.
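As an illustration of the definition of a Gabidulin code and of the MRD property, the sketch below builds the generator matrix of a toy code over $\mathbb{F}_{16}$ and checks its minimum rank distance by exhaustive search. The field model $\mathbb{F}_2[x]/(x^4 + x + 1)$, the choice $g = (1, x, x^2, x^3)$ and all function names are assumptions made for this example only.

```python
from itertools import product

M, MOD = 4, 0b10011  # F_16 modelled as F_2[x]/(x^4 + x + 1)


def gf_mul(a, b):
    """Multiply two elements of F_16 (shift-and-add with modular reduction)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a >> M:
            a ^= MOD
    return r


def frob(a, i):
    """i-th Frobenius power a^[i] = a^(2^i), by repeated squaring."""
    for _ in range(i):
        a = gf_mul(a, a)
    return a


def gf2_rank(values):
    """Rank over F_2 of integers read as bit vectors."""
    basis = []
    for v in values:
        for b in basis:
            v = min(v, v ^ b)
        if v:
            basis.append(v)
    return len(basis)


def gabidulin_generator(g, k):
    """Rows g^[0], g^[1], ..., g^[k-1]."""
    return [[frob(gj, i) for gj in g] for i in range(k)]


def encode(u, G):
    """Codeword u * G over F_16."""
    c = [0] * len(G[0])
    for ui, row in zip(u, G):
        for j, gij in enumerate(row):
            c[j] ^= gf_mul(ui, gij)
    return c


g, k = [1, 2, 4, 8], 2           # rank weight of g is n = 4
G = gabidulin_generator(g, k)
min_rank = min(gf2_rank(encode(u, G))
               for u in product(range(16), repeat=k) if any(u))
print(G, min_rank)
```

For these toy parameters the search returns $n - k + 1 = 3$, in agreement with Theorem 6; the exhaustive check is only feasible for very small fields.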
The dual of a Gabidulin code is also a Gabidulin code. Specifically, we have Gab n,k (g)

Expanded Gabidulin codes
In this section, we first introduce the definition of expanded Gabidulin codes, then investigate some of their algebraic properties. After that, we propose an efficient algorithm to decode these codes.

Expanded Gabidulin codes
Note that $\mathbb{F}_{q^m}$ can be viewed as an $\mathbb{F}_q$-linear space of dimension $m$. Let $B = (\alpha_1, \ldots, \alpha_m)$ be a basis of $\mathbb{F}_{q^m}$ over $\mathbb{F}_q$. For any $\alpha \in \mathbb{F}_{q^m}$, there exists $(a_1, \ldots, a_m) \in \mathbb{F}_q^m$ such that $\alpha = \sum_{i=1}^{m} a_i \alpha_i$. Based on this observation, we define an $\mathbb{F}_q$-linear isomorphism $\phi_B$ from $\mathbb{F}_{q^m}$ to $\mathbb{F}_q^m$ with respect to $B$ as follows:
$$\phi_B(\alpha) = (a_1, \ldots, a_m).$$
For convenience, we need to introduce a matrix representation of this transformation in [14,33].
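The coordinate map with respect to a basis can be sketched in a few lines. The example assumes $q = 2$, represents elements of $\mathbb{F}_{2^m}$ as $m$-bit integers so that field addition is XOR, and uses a hypothetical basis $B = (1, x+1, x^2+x+1)$ of $\mathbb{F}_8$; only the additive structure is needed, so no reduction polynomial appears.

```python
def coords(alpha, basis):
    """Return (a_1, ..., a_m) in F_2 with alpha = XOR of a_i * basis[i]."""
    m = len(basis)
    reduced = []  # pairs (value, mask); mask records which basis elements were combined
    for i, b in enumerate(basis):
        val, mask = b, 1 << i
        for rv, rm in reduced:
            if val ^ rv < val:
                val, mask = val ^ rv, mask ^ rm
        if val == 0:
            raise ValueError("the given elements are not linearly independent")
        reduced.append((val, mask))
    val, mask = alpha, 0
    for rv, rm in reduced:
        if val ^ rv < val:
            val, mask = val ^ rv, mask ^ rm
    if val:
        raise ValueError("alpha is not in the span of the basis")
    return [(mask >> i) & 1 for i in range(m)]


def expand_vector(v, basis):
    """Apply the coordinate map to every component and concatenate the results."""
    out = []
    for component in v:
        out.extend(coords(component, basis))
    return out


B = [0b001, 0b011, 0b111]        # assumed basis (1, x + 1, x^2 + x + 1) of F_8
print(coords(0b101, B))          # coordinates of x^2 + 1
print(expand_vector([0b101, 0b010], B))
```

A vector of length $n$ over $\mathbb{F}_{2^m}$ thus expands to a vector of length $nm$ over $\mathbb{F}_2$, which is how a code of length $n$ becomes a code of length $nm$ over the base field.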
Now we give an effective method to perform this operation, which is based on the following theorem.
It is easy to verify that the expanded code $\hat{\mathcal{G}}$ forms an $[nm, km]$ linear code over $\mathbb{F}_q$. The following proposition gives a method of constructing a generator (parity-check) matrix of an expanded Gabidulin code when a generator (parity-check) matrix of its parent Gabidulin code is known.
For a basis $B = (\alpha_1, \ldots, \alpha_m)$ of $\mathbb{F}_{q^m}$ over $\mathbb{F}_q$, let $\hat{\mathcal{G}}$ be the expanded code of $\mathcal{G}$ induced by $\phi_B$. Then we have the following conclusions.
(1) Let $G = (g_1^T, \ldots, g_k^T)^T$ be a generator matrix of $\mathcal{G}$, then $\hat{\mathcal{G}}$ has an $mk \times mn$ generator matrix $\Phi_B(G)$. We call such $\Phi_B(G)$ a normal generator matrix of $\hat{\mathcal{G}}$.
(2) Let $H = (h_1^T, \ldots, h_n^T)$ be a parity-check matrix of $\mathcal{G}$, then $\hat{\mathcal{G}}$ has an $m(n - k) \times nm$ parity-check matrix of the form
Note that Gabidulin codes are optimal in both the Hamming metric and the rank metric. However, expanded Gabidulin codes are far from optimal in the Hamming metric. Specifically, we have the following proposition.
Proposition 3. Let $\mathcal{G}$ be an $[n, k]$ Gabidulin code over $\mathbb{F}_{q^m}$. For a basis $B$ of $\mathbb{F}_{q^m}$ over $\mathbb{F}_q$, denote by $\hat{\mathcal{G}}$ the expanded code of $\mathcal{G}$ induced by $\phi_B$. Then the minimum Hamming distance of $\hat{\mathcal{G}}$ satisfies the following inequality
$$n - k + 1 \le d_H(\hat{\mathcal{G}}) \le m(n - k) + 1.$$
In particular, with a proper choice of $B$, the minimum Hamming distance of $\hat{\mathcal{G}}$ can reach $n - k + 1$.
Proof. For any $\hat{u} \in \hat{\mathcal{G}}$, there exists $u = (u_1, \ldots, u_n) \in \mathcal{G}$ such that $\hat{u} = \phi_B(u)$. Since $\mathcal{G}$ is MDS in the Hamming metric, $w_H(u) \ge n - k + 1$ for a nonzero $u$. Because $\phi_B$ is an isomorphism, every nonzero component of $u$ expands to a nonzero block of $\hat{u}$, and hence $w_H(\hat{u}) \ge w_H(u) \ge n - k + 1$.
On the other hand, by the Singleton bound for Hamming metric codes, it is clear that $d_H(\hat{\mathcal{G}}) \le nm - km + 1 = m(n - k) + 1$.

Decoding expanded Gabidulin codes
As for Gabidulin codes, several efficient decoding algorithms [21,41,42] already exist. Now we investigate the decoding problem of expanded Gabidulin codes. Our analysis shows that when the noise vector satisfies a certain condition, decoding an expanded Gabidulin code can be converted into decoding the parent Gabidulin code.
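Let $\mathcal{G} \subseteq \mathbb{F}_{q^m}^n$ be an $[n, k]$ Gabidulin code having $H$ as a parity-check matrix. Let $B$ be a basis of $\mathbb{F}_{q^m}$ over $\mathbb{F}_q$, and denote by $\hat{\mathcal{G}}$ the expanded code of $\mathcal{G}$ induced by $\phi_B$. Let $y = c + e$ be the received word, where $c \in \hat{\mathcal{G}}$ and $e = (e_1, \ldots, e_n) \in \mathbb{F}_q^{mn}$ is the noise vector with $e_j = (e_{1j}, \ldots, e_{mj}) \in \mathbb{F}_q^m$. Let $E \in \mathcal{M}_{n,m}(\mathbb{F}_q)$ be the matrix whose $j$-th row is $e_j$, called the error matrix corresponding to $e$. If the inequality
$$\mathrm{Rank}(E) \le \left\lfloor \frac{n-k}{2} \right\rfloor$$
holds, then we say $e$ satisfies the decodable condition. In this situation, we can obtain a fast decoder $\mathcal{D}_{\hat{\mathcal{G}}}$ for $\hat{\mathcal{G}}$ to decode $y$ by exploiting the syndrome decoder of $\mathcal{G}$.
Denote by $\hat{H}$ a parity-check matrix of $\hat{\mathcal{G}}$. It is easy to see that
$$y \hat{H}^T = e \hat{H}^T = \phi_B(e^* H^T),$$
where $e^* = (e^*_1, \ldots, e^*_n) \in \mathbb{F}_{q^m}^n$ with $e^*_j = \sum_{i=1}^{m} e_{ij} \alpha_i$. It is clear that $e^* = B E^T$, and therefore $w_R(e^*) = \mathrm{Rank}(E) \le \lfloor \frac{n-k}{2} \rfloor$. Applying the decoder of $\mathcal{G}$ to $\phi_B^{-1}(y \hat{H}^T) = e^* H^T$ will lead to $e^*$, and we can then recover $e$ by computing $\phi_B(e^*)$.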
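The identity $w_R(e^*) = \mathrm{Rank}(E)$ behind the decodable condition can be checked numerically. The sketch below assumes $q = 2$, represents elements of $\mathbb{F}_{2^m}$ as integers with XOR as addition, and uses a hypothetical basis and error matrix; it folds each row of $E$ into one field element and compares the two ranks.

```python
def gf2_rank(values):
    """Rank over F_2 of integers read as bit vectors."""
    basis = []
    for v in values:
        for b in basis:
            v = min(v, v ^ b)
        if v:
            basis.append(v)
    return len(basis)


def fold(E, basis):
    """e*_j = sum_i E[j][i] * alpha_i; for q = 2 the sum is an XOR of basis elements."""
    folded = []
    for row in E:
        acc = 0
        for bit, alpha in zip(row, basis):
            if bit:
                acc ^= alpha
        folded.append(acc)
    return folded


def matrix_rank_gf2(E):
    """Rank of a 0/1 matrix over F_2, packing each row into an integer."""
    return gf2_rank([int("".join(map(str, row)), 2) for row in E])


B = [0b001, 0b011, 0b111]                       # assumed basis of F_8 over F_2
E = [[1, 0, 0], [0, 1, 0], [1, 1, 0], [0, 0, 0]]  # toy error matrix, n = 4, m = 3
e_star = fold(E, B)
print(e_star, gf2_rank(e_star), matrix_rank_gf2(E))
```

Both ranks agree because folding is the inverse of the coordinate map, which is an $\mathbb{F}_2$-linear isomorphism and therefore preserves the dimension of the span.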
Four steps are needed to decode expanded Gabidulin codes. First, we compute the syndrome of the received word $y$, which requires multiplying $y$ by $\hat{H}^T$ with a complexity of $O(m^2 n(n - k))$ in $\mathbb{F}_q$. Second, we apply the inverse transformation of $\phi_B$ to the syndrome obtained in the first step, requiring a complexity of $O(mn)$ in $\mathbb{F}_{q^m}$. Third, we call the fast decoder of the parent Gabidulin code to obtain an error vector $e^*$ with $w_R(e^*) \le \lfloor \frac{n-k}{2} \rfloor$, which requires a complexity of $O(\frac{5}{2} n^2 - \frac{3}{2} k^2)$ in $\mathbb{F}_{q^m}$ [41]. In the last step, we compute $\phi_B(e^*)$ through the method described in Remark 2 with a complexity of $O(((m - 1)(q - 1) + 1)mn)$ in $\mathbb{F}_{q^m}$. Finally the total complexity of decoding expanded Gabidulin codes is O(m^2 n(q − 1) + mn(3 − q)

Two hard problems
The security of our two proposals mainly involves two hard problems in coding theory, namely the rank syndrome decoding (RSD) problem and the MinRank problem. In this section, we describe these two problems and some well-known attacks on them.

RSD problem
Definition 10 (RSD problem). Let $H \in \mathcal{M}_{n-k,n}(\mathbb{F}_{q^m})$ be a matrix of full rank, $s \in \mathbb{F}_{q^m}^{n-k}$ and $t$ be a positive integer. An RSD instance $\mathcal{R}(q, m, n, k, t)$ is to solve $s = eH^T$ for $e \in \mathbb{F}_{q^m}^n$ such that $w_R(e) \le t$.
The RSD problem plays a crucial role in rank metric based cryptography. Although this problem is not known to be NP-complete, it is believed to be hard by the community. Up to now, the best known combinatorial attacks on this problem can be found in [22,43,44,45].
In what follows, we will recall the principle of the combinatorial attack proposed in [44]. Although there are some improvements [45] for this attack, they are not applicable to our proposals.
To make the description concise, here we introduce a notation used in the sequel. For positive integers $w < v < u$, by $P_{\mathbb{F}_q}(u, v, w)$ we denote the probability that a random space of dimension $v$ in a space of dimension $u$ contains a given space of dimension $w$. Using the Gaussian binomial, we have
$$P_{\mathbb{F}_q}(u, v, w) = \binom{u-w}{v-w}_q \Big/ \binom{u}{v}_q \approx q^{-w(u-v)}.$$
For an RSD instance $\mathcal{R}(q, m, n, k, t)$, we consider the following two cases to solve the problem.
Case 1: $n > m$. Let $B$ be a basis of $\mathbb{F}_{q^m}$ over $\mathbb{F}_q$, then there exists $E \in \mathcal{M}_{m,n}(\mathbb{F}_q)$ with $\mathrm{Rank}(E) \le t$ such that $e = BE$. Let $\mathcal{E} = \langle E^T \rangle_{\mathbb{F}_q}$ and $\mathcal{V} \subseteq \mathbb{F}_q^m$ be an $\mathbb{F}_q$-linear space of dimension $t' \ge t$. If $\mathcal{E} \subseteq \mathcal{V}$, then one can express $e$ in a basis of $\mathcal{V}$ over $\mathbb{F}_q$. By computing $s = eH^T$ and expanding this system over the base field, one obtains a linear system of $m(n - k)$ equations and $nt'$ variables over $\mathbb{F}_q$. To have only one solution with overwhelming probability, one needs $nt' \le m(n - k)$, that is $t' \le m - \frac{km}{n}$. By taking $t' = m - \lceil \frac{km}{n} \rceil$, one gets a complexity of $O(m^3 (n - k)^3 / p)$ in $\mathbb{F}_q$, where $p = P_{\mathbb{F}_q}(m, t', t)$.
Case 2: $n \le m$. Let $B$ be a basis of $\mathbb{F}_{q^m}$ over $\mathbb{F}_q$, then there exists $E \in \mathcal{M}_{m,n}(\mathbb{F}_q)$ with $\mathrm{Rank}(E) \le t$ such that $e = BE$. Let $\mathcal{E} = \langle E \rangle_{\mathbb{F}_q}$ and $\mathcal{V} \subseteq \mathbb{F}_q^n$ be an $\mathbb{F}_q$-linear space of dimension $t' \ge t$. If $\mathcal{E} \subseteq \mathcal{V}$, then one can express $E$ in a basis of $\mathcal{V}$ over $\mathbb{F}_q$. By computing $s = eH^T$ and expanding this system over the base field, one obtains a linear system of $m(n - k)$ equations and $mt'$ variables over $\mathbb{F}_q$. To have only one solution with overwhelming probability, one needs $mt' \le m(n - k)$, that is $t' \le n - k$. By taking $t' = n - k$, one gets a complexity of $O(m^3 (n - k)^3 / p)$ in $\mathbb{F}_q$, where $p = P_{\mathbb{F}_q}(n, t', t)$.
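The cost estimate in the two cases above can be evaluated with exact arithmetic. The following sketch computes the Gaussian binomial, the containment probability as the standard ratio of Gaussian binomials, and the base-2 logarithm of $m^3(n-k)^3/p$. Constants hidden in the $O(\cdot)$ notation are ignored, the function names are illustrative, and the parameters in the usage line are a toy instance rather than parameters from the paper.

```python
from fractions import Fraction
from math import log2


def gaussian_binomial(n, k, q):
    """Number of k-dimensional subspaces of an n-dimensional space over F_q."""
    if k < 0 or k > n:
        return 0
    num = den = 1
    for i in range(k):
        num *= q ** (n - i) - 1
        den *= q ** (i + 1) - 1
    return num // den


def containment_probability(q, u, v, w):
    """Probability that a random v-dim subspace of a u-dim space contains a fixed w-dim one."""
    return Fraction(gaussian_binomial(u - w, v - w, q), gaussian_binomial(u, v, q))


def log2_attack_cost(q, m, n, k, t):
    """log2 of m^3 (n-k)^3 / p for the two cases of the combinatorial attack."""
    if n > m:                                  # Case 1
        ambient, t_prime = m, m - (-((-k * m) // n))   # t' = m - ceil(km/n)
    else:                                      # Case 2
        ambient, t_prime = n, n - k
    if t_prime < t:
        raise ValueError("no admissible t' >= t for these parameters")
    p = containment_probability(q, ambient, t_prime, t)
    return 3 * log2(m * (n - k)) - (log2(p.numerator) - log2(p.denominator))


print(containment_probability(2, 4, 2, 1))   # exact value for a toy case
print(log2_attack_cost(2, 4, 4, 2, 1))
```

For the toy case the exact probability is 1/5, close to the approximation $q^{-w(u-v)} = 1/4$; the gap between the two shrinks as $q$ or the dimensions grow.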

MinRank problem
Definition 11 (MinRank problem). For a finite field $\mathbb{F}_q$ and positive integers $m, n, k, t$, let $M_1, \ldots, M_k \in \mathcal{M}_{m,n}(\mathbb{F}_q)$. A MinRank instance of parameters $(q, m, n, k, t)$ is to search for $x_1, \ldots, x_k \in \mathbb{F}_q$, not all zero, such that $\mathrm{Rank}(\sum_{i=1}^{k} x_i M_i) \le t$.
The MinRank problem was first introduced and proven NP-complete by Buss et al. in [46]. This problem is of great importance in both multivariate cryptography [48] and rank metric based cryptography [44]. In Table 1, we give the best algebraic attacks in [47] on the MinRank problem, where $\omega = 2.8$ is the linear algebra constant.

Description of our proposals
Now we give a formal description of our two proposals.

Proposal I
For a given security level, choose a finite field $\mathbb{F}_q$ and positive integers $k < n \le m$ and $\lambda$ such that $\frac{m(n-k)}{n} < \lambda < m$. Let $K = \lambda n - m(n - k)$ and $N = \lambda n$. For $0 \le j \le n - 1$, we define $I_j = \{mj + 1, \ldots, mj + \lambda\}$ and let $S = \cup_{j=0}^{n-1} I_j$. Now we give a formal description of our first proposal through the following three procedures.
• Key generation Let $\mathcal{G} \subseteq \mathbb{F}_{q^m}^n$ be an $[n, k]$ Gabidulin code. Randomly choose a basis $B = (\alpha_1, \ldots, \alpha_m)$ of $\mathbb{F}_{q^m}$ over $\mathbb{F}_q$ and let $\hat{\mathcal{G}} = \phi_B(\mathcal{G})$. Denote by $\hat{H}$ a parity-check matrix of $\hat{\mathcal{G}}$ of the form (2), and by $\hat{H}_S$ the submatrix of $\hat{H}$ formed by the columns indexed by $S$. Let $\hat{\mathcal{G}}_S = \langle \hat{H}_S \rangle_{\mathbb{F}_q}^{\perp}$ and $G_S$ be a generator matrix of $\hat{\mathcal{G}}_S$. Randomly choose $A \in GL_{\lambda}(\mathbb{F}_q)$ and set $T = I_n \otimes A$, where $I_n$ is the identity matrix of order $n$. Randomly choose $M \in GL_K(\mathbb{F}_q)$ such that $G_{pub} = M G_S T^{-1}$ is of systematic form. If such an $M$ does not exist, then repeat the process above. The public key is $(G_{pub}, t)$ where $t = \lfloor \frac{n-k}{2} \rfloor$, and the secret key is $(\hat{H}_S, A, \mathcal{D}_{\hat{\mathcal{G}}})$ where $\mathcal{D}_{\hat{\mathcal{G}}}$ is the fast decoder of $\hat{\mathcal{G}}$.

• Encryption
For a plaintext $x \in \mathbb{F}_q^K$, randomly choose $E \in \mathcal{M}_{n,\lambda}(\mathbb{F}_q)$ with $\mathrm{Rank}(E) = t$. Let $e = (e_1, \ldots, e_n) \in \mathbb{F}_q^N$, where $e_i$ is the $i$-th row vector of $E$. The ciphertext corresponding to $x$ is computed as $y = xG_{pub} + e$.

• Decryption
For a ciphertext $y \in \mathbb{F}_q^N$, let $e' = eT$ and compute
$$s = yT\hat{H}_S^T = e'\hat{H}_S^T.$$
Applying $\mathcal{D}_{\hat{\mathcal{G}}}$ to $s$ will lead to a vector $e'' \in \mathbb{F}_q^{mn}$. The restriction of $e''$ to $S$ will be $e'$, and we can then recover $e$ by computing $e'T^{-1}$. The plaintext will be the restriction of $y - e$ to the first $K$ coordinates.

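Correctness of Decryption. Let $e' = (e'_1, \ldots, e'_n)$ with $e'_i = e_i A \in \mathbb{F}_q^{\lambda}$.
Define $E' \in \mathcal{M}_{n,\lambda}(\mathbb{F}_q)$ to be the matrix whose $i$-th row vector is $e'_i$. Let $e'' = (e''_1, \ldots, e''_n)$, where $e''_i = (e'_i, 0)$ and $0$ denotes the zero vector of length $m - \lambda$. Define $E'' \in \mathcal{M}_{n,m}(\mathbb{F}_q)$ to be the matrix whose $i$-th row vector is $e''_i$. It is easy to see that $E' = EA$ and $E'' = (E', 0)$. Then
$$\mathrm{Rank}(E'') = \mathrm{Rank}(E') = \mathrm{Rank}(E) = t = \left\lfloor \frac{n-k}{2} \right\rfloor,$$
which implies that $e''$ satisfies the decodable condition described in Section 3.2. Applying the fast decoder of $\hat{\mathcal{G}}$ to $s = e'\hat{H}_S^T = e''\hat{H}^T$ will lead to $e''$, and the restriction of $e''$ to $S$ will be $e'$.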
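The rank argument in the correctness proof of Proposal I can be checked on a toy instance. The sketch below assumes $q = 2$ and hypothetical small values $n = 4$, $m = 5$, $\lambda = 3$; it applies a block-wise mixing matrix $A$ to the rows of $E$, pads each block with $m - \lambda$ zeros, and verifies that the rank is unchanged and that restricting to $S$ returns the mixed error.

```python
def mat_mul_gf2(X, Y):
    """Matrix product over F_2 for 0/1 matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) % 2 for col in zip(*Y)]
            for row in X]


def rank_gf2(X):
    """Rank over F_2, packing each row into an integer."""
    basis = []
    for row in X:
        v = int("".join(map(str, row)), 2)
        for b in basis:
            v = min(v, v ^ b)
        if v:
            basis.append(v)
    return len(basis)


n, m, lam = 4, 5, 3                                   # toy sizes, not from the paper
E = [[1, 0, 0], [0, 1, 0], [1, 1, 0], [0, 0, 0]]      # error matrix of rank 2
A = [[1, 1, 0], [0, 1, 1], [0, 0, 1]]                 # invertible over F_2

E1 = mat_mul_gf2(E, A)                                # rows e'_i = e_i A, i.e. e' = eT
E2 = [row + [0] * (m - lam) for row in E1]            # rows e''_i = (e'_i, 0)

S = [m * j + i for j in range(n) for i in range(lam)]  # 0-based version of the index set S
e1_flat = [x for row in E1 for x in row]
e2_flat = [x for row in E2 for x in row]

print(rank_gf2(E), rank_gf2(E1), rank_gf2(E2))
print([e2_flat[i] for i in S] == e1_flat)
```

Multiplying by an invertible $A$ and appending zero columns both preserve the rank, which is why the padded error still meets the decodable condition.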

Proposal II
For a given security level, choose a finite field $\mathbb{F}_q$ and positive integers $\lambda \ll k < n \le m$. Let $K = km$ and $N = nm$. Now we give a formal description of our second proposal through the following three procedures.
• Key generation Let $\mathcal{G} \subseteq \mathbb{F}_{q^m}^n$ be an $[n, k]$ Gabidulin code. Randomly choose a basis $B = (\alpha_1, \ldots, \alpha_m)$ of $\mathbb{F}_{q^m}$ over $\mathbb{F}_q$, and let $\hat{\mathcal{G}} = \phi_B(\mathcal{G})$ be an $[N, K]$ expanded code of $\mathcal{G}$. Let $\hat{G}$ be a generator matrix of $\hat{\mathcal{G}}$, and $\hat{H}$ a parity-check matrix of the form (2). Let $u_f = \lfloor \frac{n}{\lambda} \rfloor$, $u_c = \lceil \frac{n}{\lambda} \rceil$ and $v = n - \lambda u_f$. Randomly choose $A \in GL_{m\lambda}(\mathbb{F}_q)$ such that the $mv \times mv$ submatrix $A_{sub}$ in the top left corner of $A$ is invertible. The matrix $T$ is then defined in terms of $A_{sub}$ and $A_{ten}$, where $A_{ten}$ is the tensor product $I_{u_f} \otimes A$. If the first $K$ columns of $\hat{G}T^{-1}$ are linearly independent over $\mathbb{F}_q$, then choose $M \in GL_K(\mathbb{F}_q)$ to convert $G_{pub} = M\hat{G}T^{-1}$ into systematic form. Otherwise, one rechooses the matrix $T$. Then the public key is $(G_{pub}, t)$ where $t = \lfloor \frac{n-k}{2\lambda} \rfloor$, and the private key is $(\hat{H}, A, \mathcal{D}_{\hat{\mathcal{G}}})$ where $\mathcal{D}_{\hat{\mathcal{G}}}$ is the fast decoder of $\hat{\mathcal{G}}$.
such that $\mathrm{Rank}(E) = t$, where $e_i \in \mathbb{F}_q^m$ for $1 \le i \le n$. Let $e = (e_1, \ldots, e_n) \in \mathbb{F}_q^N$, then the ciphertext corresponding to $x$ is computed as $y = xG_{pub} + e$.

• Decryption
For a ciphertext $y \in \mathbb{F}_q^N$, compute $s = yT\hat{H}^T = eT\hat{H}^T$. Applying $\mathcal{D}_{\hat{\mathcal{G}}}$ to $s$ will lead to $e' = eT$, and one can then recover $e$ by computing $e'T^{-1}$. The restriction of $y - e$ to the first $K$ coordinates will be the plaintext.
Correctness of Decryption. First, we introduce the following proposition.
then $\mathrm{Rank}(F) \le \mathrm{Rank}(F') \le \lambda t \le \lfloor \frac{n-k}{2} \rfloor$ by Proposition 4.
Remark 3. The cryptosystem presented above deals with the general situation where $\lambda$ does not divide $n$, or equivalently $u_f \ne u_c$. As for the case of $u_f = u_c$, just a few changes are needed in the key generation procedure. To generate the column scrambling matrix $T^{-1}$, any $A \in GL_{m\lambda}(\mathbb{F}_q)$ is feasible for computing $T = I_{u_f} \otimes A$.
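The bookkeeping for Proposal II is easy to get wrong when $\lambda$ does not divide $n$, so a small helper is sketched below. It derives $u_f$, $u_c$, $v$, $t$, $K$ and $N$ from $(n, k, m, \lambda)$ and checks two consistency conditions: the expanded error rank bound $\lambda t \le \lfloor (n-k)/2 \rfloor$, and that the sizes of $A_{sub}$ and $A_{ten}$ add up to $N$. The function name and the sample parameters are illustrative only.

```python
def proposal2_parameters(n, k, m, lam):
    """Derived quantities of Proposal II for given (n, k, m, lambda)."""
    u_f = n // lam                 # floor(n / lambda)
    u_c = -((-n) // lam)           # ceil(n / lambda)
    v = n - lam * u_f              # number of leftover blocks
    t = (n - k) // (2 * lam)       # rank of the error matrix chosen at encryption
    params = {"u_f": u_f, "u_c": u_c, "v": v, "t": t, "K": k * m, "N": n * m}
    # The mixed error has rank at most lambda * t, which must stay decodable.
    assert lam * t <= (n - k) // 2
    # A_sub is (m v) x (m v) and A_ten is (m lambda u_f) x (m lambda u_f).
    assert m * v + m * lam * u_f == params["N"]
    return params


print(proposal2_parameters(n=10, k=4, m=10, lam=3))
```

When $\lambda$ divides $n$ the helper returns $v = 0$ and $u_f = u_c$, which corresponds to the simplified key generation of Remark 3.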

Security analysis
This section mainly discusses the security of the proposed cryptosystems. Attacks on code-based cryptosystems can be divided into two categories, namely structural attacks and generic attacks. A structural attack aims to recover the structure of the underlying code from the published information, which amounts to recovering the private key or an equivalent form that can be used to decrypt any valid ciphertext in polynomial time. A generic attack aims to recover the plaintext directly without knowing the private key, which implies that one has to deal with the underlying hard problem.
In the remainder of this section, we will first introduce a distinguisher for Gabidulin codes and explain why our proposals can prevent the existing structural attacks. Following this, we introduce the concept of twisted Frobenius power and build a distinguisher for expanded Gabidulin codes, which provides an approach for us to distinguish expanded Gabidulin codes from random linear codes. After that, we will investigate the practical security of our proposals from two aspects.

Existing structural attacks
Now we describe some properties of Gabidulin codes under the Frobenius map, which will be useful for us to explain why our proposals can prevent the related structural attacks. The following two propositions provide an approach for us to distinguish Gabidulin codes from general ones.
Most cryptosystems based on Gabidulin codes have been shown to be insecure due to their vulnerability to structural attacks, such as Overbeck's attack [30], the Coggia-Couvreur attack [38] and the attack proposed in [35]. Although these attacks were designed to cryptanalyze different variants, most of them rely on the fact that one can distinguish Gabidulin codes from general ones by observing how their dimensions behave under the Frobenius map according to Propositions 5 and 6. However, this property is no longer valid when considering our proposals. Since our proposals are built over the base field $\mathbb{F}_q$, it is clear that $\hat{\mathcal{G}}^{[i]} = \hat{\mathcal{G}}$ for any integer $i$. In this situation, Gabidulin codes will be indistinguishable from random codes. Hence it is reasonable to conclude that none of these attacks works on our proposals.

A distinguisher for expanded Gabidulin codes
Let $\mathcal{G} \subseteq \mathbb{F}_{q^m}^n$ be an $[n, k]$ Gabidulin code, and $\hat{\mathcal{G}}$ an expanded code of $\mathcal{G}$ with respect to a basis $B$ of $\mathbb{F}_{q^m}$ over $\mathbb{F}_q$. Given a generator matrix of $\hat{\mathcal{G}}$, Berger et al. [33] proposed an efficient approach to compute $\mathcal{G}^{[s]}$ for some $0 \le s \le m - 1$, which is also an $[n, k]$ Gabidulin code. A key point is that one can recover a normal generator matrix of $\hat{\mathcal{G}}$, which can be done by reducing any generator matrix of $\hat{\mathcal{G}}$ into systematic form. Hence the direct application of expanded Gabidulin codes will lead to an insecure scheme. Indeed, one can obtain more information from a generator matrix of $\hat{\mathcal{G}}$. For instance, one can recover the expanded code of $\mathcal{G}^{[s]}$ without knowing the basis $B$. To do this, we need to introduce the concept of twisted Frobenius power.
Let $G \in \mathcal{M}_{K,N}(\mathbb{F}_q)$, where $K = km$ and $N = nm$. For $1 \le i \le k$ and $1 \le j \le n$, let $I_i = \{(i - 1)m + 1, \ldots, im\}$ and $J_j = \{(j - 1)m + 1, \ldots, jm\}$. Denote by $G_{ij}$ the submatrix of $G$ formed by the rows indexed by $I_i$ and the columns indexed by $J_j$. For a positive integer $s$, we define $G^{(s)} = (G_{ij}^{q^s})$ to be the $s$-th twisted Frobenius power of $G$, where $G_{ij}^{q^s}$ denotes the usual $q^s$-th power of $G_{ij}$.
For a sequence $\mathbf{s} = (1, 2, \ldots, N)$, we partition $\mathbf{s}$ into $n$ blocks, each of which has length $m$. Let $\mathcal{C}$ be an $[n, k]$ linear code over $\mathbb{F}_{q^m}$, and $\hat{\mathcal{C}}$ an $[N, K]$ expanded code of $\mathcal{C}$. Let $I_1, \ldots, I_k$ be $k$ blocks of $\mathbf{s}$. If $I = \cup_{i=1}^{k} I_i$ forms an information set of $\hat{\mathcal{C}}$, then we call $I$ a block information set of $\hat{\mathcal{C}}$. It is clear that $\hat{\mathcal{C}}$ admits at least one block information set. In the sequel, when we talk about a block information set of a linear code, we always mean the first one in lexicographic order. Let $I$ be a block information set of $\hat{\mathcal{C}}$, and $G_I \in \mathcal{M}_{K,N}(\mathbb{F}_q)$ a generator matrix of $\hat{\mathcal{C}}$, where the submatrix of $G_I$ formed by the columns indexed by $I$ is an identity matrix of order $K$. Then we define the $s$-th twisted Frobenius power of $\hat{\mathcal{C}}$ as $\hat{\mathcal{C}}^{(s)} = \langle G_I^{(s)} \rangle_{\mathbb{F}_q}$. It is easy to see that $\hat{\mathcal{C}}^{(s)}$ is an expanded code of $\mathcal{C}^{[s]}$ with respect to $B$ and does not rely on the choice of the block information set.
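The block-wise definition of the twisted Frobenius power of a matrix translates directly into code. The sketch below assumes $q = 2$ and raises every $m \times m$ block of a 0/1 matrix to the power $2^s$ using square-and-multiply; the function names and the toy matrix are illustrative and not taken from the paper.

```python
def mat_mul_gf2(X, Y):
    """Matrix product over F_2 for 0/1 matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) % 2 for col in zip(*Y)]
            for row in X]


def mat_pow_gf2(X, e):
    """X^e over F_2 by square-and-multiply (e >= 0)."""
    size = len(X)
    result = [[int(i == j) for j in range(size)] for i in range(size)]
    while e:
        if e & 1:
            result = mat_mul_gf2(result, X)
        X = mat_mul_gf2(X, X)
        e >>= 1
    return result


def twisted_frobenius_power(G, m, s):
    """Replace every m x m block G_ij of G by its (2^s)-th matrix power (q = 2)."""
    k_blocks, n_blocks = len(G) // m, len(G[0]) // m
    out = [[0] * len(G[0]) for _ in G]
    for bi in range(k_blocks):
        for bj in range(n_blocks):
            block = [row[bj * m:(bj + 1) * m] for row in G[bi * m:(bi + 1) * m]]
            powered = mat_pow_gf2(block, 2 ** s)
            for r in range(m):
                out[bi * m + r][bj * m:(bj + 1) * m] = powered[r]
    return out


# Toy matrix with m = 2: an identity block followed by a block of multiplicative order 3.
G = [[1, 0, 0, 1],
     [0, 1, 1, 1]]
print(twisted_frobenius_power(G, 2, 1))
print(twisted_frobenius_power(G, 2, 2) == G)
```

In the toy example the second block has order 3, so raising it to the power $2^2 = 4$ returns the original block, mirroring the fact that the Frobenius map on $\mathbb{F}_{q^m}$ has order $m$.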
The concept of twisted Frobenius power actually provides an approach for us to compute the expanded code of $\mathcal{C}^{[s]}$ even if the corresponding basis is not known. Furthermore, we have the following two propositions, which describe an effective distinguisher between expanded Gabidulin codes and expanded codes of random codes.
Proposition 7. Let $\mathcal{G} \subseteq \mathbb{F}_{q^m}^n$ be an $[n, k]$ Gabidulin code, and $\hat{\mathcal{G}} \subseteq \mathbb{F}_q^N$ an expanded code of $\mathcal{G}$ with respect to a basis $B$ of $\mathbb{F}_{q^m}$ over $\mathbb{F}_q$. For any positive integer $i$, the following equality holds
$$\dim(\hat{\mathcal{G}} + \hat{\mathcal{G}}^{(1)} + \cdots + \hat{\mathcal{G}}^{(i)}) = \min\{N, (k + i)m\}.$$
It follows that which leads to the conclusion immediately from Proposition 5.
Proposition 8. Let $\hat{\mathcal{C}} \subseteq \mathbb{F}_q^N$ be an expanded code of an $[n, k]$ random code $\mathcal{C} \subseteq \mathbb{F}_{q^m}^n$. For any positive integer $i$, the following equality holds with high probability
$$\dim(\hat{\mathcal{C}} + \hat{\mathcal{C}}^{(1)} + \cdots + \hat{\mathcal{C}}^{(i)}) = \min\{N, k(i + 1)m\}.$$
Proof. Similar to the proof of Proposition 7, the conclusion is obtained immediately from Proposition 6.
An expanded code always has a block information set. However, a random code may not have one, such as a linear code C generated by a matrix G that has a full-zero column in each block. If a linear code has a block information set, then we define its twisted Frobenius power following the way described above. Otherwise, we choose a generator matrix of C at random, say G, and compute C (s) = G (s) Fq . In both situations, however, C (s) generally depends on the choice of the block information set or the generator matrix G. Furthermore, we have a heuristic from Proposition 8 described as follows, which has been verified practically through numerous experiments.
Heuristic. Let C ⊆ F N q be an [N, K] random code, where N = nm and K = km. For any positive integer i, the following equality holds with high probability dim( C + C (1) + · · · + C (i) ) = min{N, k(i + 1)m}.
With this heuristic and Propositions 7,8, we actually build a distinguisher for expanded Gabidulin codes. By definition, the applicable condition for computing the twisted Frobenius power of a code is that both the length and dimension should be multiples of the extension degree. In our implementation, we choose m = n and therefore this applicable condition is satisfied. According to our experimental results, both the public code and its dual in our two proposals behave more like a random code. Therefore, we believe that these two proposals can prevent a potential attack based on this distinguisher.
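The decision rule implied by Propositions 7 and 8 and the heuristic can be summarized as a comparison of an observed dimension against two predicted values. The sketch below only encodes these predictions; computing the observed dimension of the sum of twisted Frobenius powers is left to the caller, and the function names and sample parameters are illustrative.

```python
def expected_dim_gabidulin(n, k, m, i):
    """Predicted dimension for an expanded Gabidulin code (Proposition 7)."""
    return min(n * m, (k + i) * m)


def expected_dim_random(n, k, m, i):
    """Predicted dimension for a random code (Proposition 8 and the heuristic)."""
    return min(n * m, k * (i + 1) * m)


def classify(observed_dim, n, k, m, i):
    """Compare an observed dimension with the two predictions."""
    gab = expected_dim_gabidulin(n, k, m, i)
    rnd = expected_dim_random(n, k, m, i)
    if gab == rnd:
        return "inconclusive"          # the two predictions coincide for these parameters
    if observed_dim == gab:
        return "gabidulin-like"
    if observed_dim == rnd:
        return "random-like"
    return "inconclusive"


n, k, m, i = 8, 3, 8, 1                # toy parameters with m = n
print(expected_dim_gabidulin(n, k, m, i), expected_dim_random(n, k, m, i))
```

The distinguisher is informative only when the two predictions differ, that is, when $(k + i)m < \min\{N, k(i + 1)m\}$; for $k = 1$ or for large $i$ both predictions coincide.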

Generic attacks
In this section, we will show that decrypting a valid ciphertext in our proposals can be converted into solving a MinRank instance. To facilitate the description of problems, we introduce an $\mathbb{F}_q$-linear isomorphism $\sigma_n$ from $\mathbb{F}_q^{ns}$ to $\mathcal{M}_{n,s}(\mathbb{F}_q)$. For a vector $x = (x_1, \ldots, x_n) \in \mathbb{F}_q^{ns}$ with $x_i \in \mathbb{F}_q^s$, we define $\sigma_n(x)$ as the $n \times s$ matrix whose $i$-th row is $x_i$. For a set $\mathcal{X} \subseteq \mathbb{F}_q^{ns}$, we define $\sigma_n(\mathcal{X}) = \{\sigma_n(x) : x \in \mathcal{X}\}$. For any $x \in \mathbb{F}_q^{ns}$, by $w_R(x)$ we mean the usual rank of $\sigma_n(x)$ hereafter when no ambiguity arises.
Reducing Proposal I to the MinRank problem. In Proposal I, $K = \lambda n - m(n - k)$, $N = \lambda n$ and $t = \lfloor \frac{n-k}{2} \rfloor$. Let $y = c + e$ be a valid ciphertext in Proposal I, where $c \in \langle G_{pub} \rangle_{\mathbb{F}_q}$ and $e \in \mathbb{F}_q^N$ with $w_R(e) \le t$. Let $M_0 = \sigma_n(y)$, and $M_i = \sigma_n(m_i)$ where $m_i$ denotes the $i$-th row of $G_{pub}$ for $1 \le i \le K$. Then recovering $e$ can be reduced to a MinRank instance of parameters $(q, n, \lambda, K + 1, t)$, that is to search for $a_0, a_1, \ldots, a_K \in \mathbb{F}_q$ such that $\mathrm{Rank}(\sum_{i=0}^{K} a_i M_i) \le t$.
Reducing Proposal II to the MinRank problem. In Proposal II, $K = km$, $N = nm$ and $t = \lfloor \frac{n-k}{2\lambda} \rfloor$. Let $y = c + e$ be a valid ciphertext in Proposal II, where $c \in \langle G_{pub} \rangle_{\mathbb{F}_q}$ and $e \in \mathbb{F}_q^N$ with $w_R(e) \le \lambda t$. Let $M_0 = \sigma_n(y)$, and $M_i = \sigma_n(m_i)$ where $m_i$ denotes the $i$-th row of $G_{pub}$ for $1 \le i \le K$. Then recovering $e$ can be reduced to a MinRank instance of parameters $(q, n, m, K + 1, \lambda t)$, that is to search for $a_0, a_1, \ldots, a_K \in \mathbb{F}_q$ such that $\mathrm{Rank}(\sum_{i=0}^{K} a_i M_i) \le \lambda t$.
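A brute-force version of the reduction helps to see what a MinRank solution looks like. The sketch below assumes $q = 2$, takes the reshaping map to be the row-by-row split of a vector into an $n \times s$ matrix, fixes the coefficient of the ciphertext matrix to $a_0 = 1$, and enumerates all coefficient vectors for a toy code with a single generator row. The names and the toy data are illustrative; the search is exponential in $K$ and is not one of the attacks analyzed in the paper.

```python
from itertools import product


def sigma(x, n, s):
    """Reshape a vector of length n*s into an n x s matrix, row by row."""
    return [x[i * s:(i + 1) * s] for i in range(n)]


def rank_gf2(X):
    """Rank over F_2, packing each row into an integer."""
    basis = []
    for row in X:
        v = int("".join(map(str, row)), 2)
        for b in basis:
            v = min(v, v ^ b)
        if v:
            basis.append(v)
    return len(basis)


def minrank_solutions(y, rows, n, s, t):
    """All (a_1, ..., a_K) in F_2^K with Rank(sigma(y + sum a_i m_i)) <= t (a_0 = 1)."""
    solutions = []
    for a in product((0, 1), repeat=len(rows)):
        v = list(y)
        for ai, r in zip(a, rows):
            if ai:
                v = [vi ^ ri for vi, ri in zip(v, r)]
        if rank_gf2(sigma(v, n, s)) <= t:
            solutions.append(a)
    return solutions


n, s, t = 3, 3, 1
rows = [[1, 0, 0, 0, 1, 0, 0, 0, 1]]          # toy public generator row (identity when reshaped)
e = [1, 1, 0, 0, 0, 0, 0, 0, 0]               # error of rank 1
y = [r ^ ei for r, ei in zip(rows[0], e)]     # ciphertext for plaintext x = (1)
print(minrank_solutions(y, rows, n, s, t))
```

Over $\mathbb{F}_2$ subtraction and addition coincide, so the coefficient vector found by the search is the plaintext itself, and $y$ plus the corresponding codeword is the error $e$.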

Combinatorial attacks
Based on the idea of combinatorial attacks on the RSD problem described in Section 4.1, we now propose a combinatorial attack to evaluate the practical security of our two proposals.
Proposal I. In the case of $n > \lambda$, let $\mathcal{E} = \langle E \rangle_{\mathbb{F}_q}$ and $\mathcal{V} \subseteq \mathbb{F}_q^{\lambda}$ be an $\mathbb{F}_q$-space of dimension $t' \ge t$. If $\mathcal{E} \subseteq \mathcal{V}$, then one can express $E$ in a basis of $\mathcal{V}$ over $\mathbb{F}_q$. With the same analysis as Case 1 in Section 4.1, one can obtain a linear system of $m(n - k)$ equations with $nt'$ variables in $\mathbb{F}_q$. Let $m(n - k) \ge nt'$, then $t' \le m - \lceil \frac{km}{n} \rceil$. By taking $t' = m - \lceil \frac{km}{n} \rceil$, one gets a complexity of $O\left(m^3 (n - k)^3 q^{t(\lambda - m + \lceil km/n \rceil)}\right)$. In the case of $n \le \lambda$, let $\mathcal{E} = \langle E^T \rangle_{\mathbb{F}_q}$ and $\mathcal{V} \subseteq \mathbb{F}_q^n$ be an $\mathbb{F}_q$-space of dimension $t' \ge t$. If $\mathcal{E} \subseteq \mathcal{V}$, then one can express $E$ in a basis of $\mathcal{V}$ over $\mathbb{F}_q$. With the same analysis as Case 2 in Section 4.1, one can obtain a linear system of $m(n - k)$ equations with $\lambda t'$ variables in $\mathbb{F}_q$. Let $m(n - k) \ge \lambda t'$, then $t' \le \lfloor \frac{m(n-k)}{\lambda} \rfloor$.
Proposal II. On the one hand, $\mathcal{E} = \langle e_1, \ldots, e_n \rangle_{\mathbb{F}_q}$ has dimension at most $\lambda t$. Because of $n \le m$, with the same analysis as Case 2 in Section 4.1, one gets a complexity of $O\left(m^3 (n - k)^3 q^{\lambda t k}\right)$.
On the other hand, $e$ is obtained from $E \in \mathcal{M}_{u_c, m\lambda}(\mathbb{F}_q)$ with $\mathrm{Rank}(E) = t$. For $\lambda \ge 2$, it is clear that $u_c = \lceil \frac{n}{\lambda} \rceil < n < m\lambda$. Let $\mathcal{E} = \langle E^T \rangle_{\mathbb{F}_q}$ and $\mathcal{V} \subseteq \mathbb{F}_q^{u_c}$ be an $\mathbb{F}_q$-space of dimension

Table 4: Public key size (in bytes) and information rate.

We investigate the performance of the proposed cryptosystems in three cases, as summarized in Tables 2, 3 and 4. Taking the public key size as the more important issue, we suggest the parameter sets in Table 4 for the corresponding security level. With such a choice of parameter sets, we make a comparison on public key size with some other code-based cryptosystems in Table 5.

Conclusion
This paper presents two cryptosystems by using expanded Gabidulin codes in the McEliece setting. In our proposals, the underlying expanded Gabidulin code is divided into n blocks, with each block having size m. By definition, each block corresponds to one component of the parent Gabidulin code. To destroy this correspondence, in Proposal I we first shorten the expanded Gabidulin code, then perform a column-mixing transformation to each block. In Proposal II, we adopt a rather different column-mixing transformation to the underlying code by mixing λ neighbouring blocks. The greatest advantage of using expanded Gabidulin codes is that all existing structural attacks based on the Frobenius map no longer make sense. Additionally, an effective distinguisher is introduced for expanded Gabidulin codes, and the public code in these two proposals seems indistinguishable from random codes under this distinguisher. Furthermore, our proposals have a clear advantage in public key representation over some other code-based cryptosystems. For instance, we have reduced the public key size by around 96% compared to Classic McEliece entering the third round of the NIST PQC project.