The XZZX Surface Code

We show that a variant of the surface code---the XZZX code---offers remarkable performance for fault-tolerant quantum computation. The error threshold of this code matches what can be achieved with random codes (hashing) for every single-qubit Pauli noise channel; it is the first explicit code shown to have this universal property. We present numerical evidence that the threshold even exceeds this hashing bound for an experimentally relevant range of noise parameters. Focusing on the common situation where qubit dephasing is the dominant noise, we show that this code has a practical, high-performance decoder and surpasses all previously known thresholds in the realistic setting where syndrome measurements are unreliable. We go on to demonstrate the favorable sub-threshold resource scaling that can be obtained by specializing a code to exploit structure in the noise. We show that it is possible to maintain all of these advantages when we perform fault-tolerant quantum computation. We finally suggest some small-scale experiments that could exploit noise bias to reduce qubit overhead in two-dimensional architectures.


I. INTRODUCTION
A large-scale quantum computer must be able to reliably process data encoded in a nearly noiseless quantum system. To build such a quantum computer using physical qubits that experience errors from noise and faulty control, we require an architecture that operates fault-tolerantly [1][2][3][4], using quantum error correction to repair errors that occur throughout the computation.
For a fault-tolerant architecture to be practical, it will need to correct for physically relevant errors with only a modest overhead. That is, quantum error correction can be used to create near-perfect logical qubits if the rate of relevant errors on the physical qubits is below some threshold, and a good architecture should have a sufficiently high threshold to be achievable in practice. These fault-tolerant designs should also be efficient, using a reasonable number of physical qubits to achieve the desired logical error rate. The most common architecture for fault-tolerant quantum computing is based on the surface code [5]. It offers thresholds against depolarizing noise that are already high, and encouraging recent results have shown that its performance against more structured noise can be considerably improved by tailoring the code to the noise model [6][7][8][9][10]. While the surface code has already demonstrated promising thresholds, its overheads are daunting [5,11]. For fault-tolerant quantum computing to become practical, we need to design architectures that provide high thresholds against relevant noise models while minimizing overheads through efficiencies in physical qubits and logic gates.
In this paper, we present a highly efficient fault-tolerant architecture design that exploits the common structures in the noise experienced by physical qubits. Our central tool is a variant of the surface code [12][13][14] where the stabilizer checks are given by the product XZZX of Pauli operators around each face on a square lattice [15]. This seemingly innocuous local change of basis offers a number of significant advantages over its more conventional counterpart for structured noise models that deviate from depolarizing noise.
We first consider preserving a logical qubit in a quantum memory using this XZZX code. While some 2D codes have been shown to have high error thresholds for certain types of biased noise [7,16], we find that the XZZX code gives exceptional thresholds for all single-qubit Pauli noise channels, matching what is known to be achievable with random coding [17], [18, Theorem 24.6.2]. It is particularly striking that the XZZX code can match the threshold performance of a random code, for any single-qubit Pauli error model, while retaining the practical benefits of local stabilizers and an efficient decoder. Intriguingly, for noise that is strongly biased towards X or Z, we have numerical evidence to suggest that the XZZX threshold exceeds this random coding (hashing) bound, meaning it can correct errors when the noise entropy per qubit exceeds one bit. Thus, this code could potentially provide a practical demonstration of the superadditivity of coherent information [19][20][21][22][23].
We show that these high thresholds persist with efficient, practical decoders, by using a generalization of a matching decoder in the regime where dephasing noise is dominant. In the fault-tolerant setting when stabilizer measurements are unreliable, we obtain thresholds in the biased-noise regime that surpass all previously known thresholds.
Once we have qubits and operations that perform below the threshold error rate, the practicality of scalable quantum computation is determined by the overhead, i.e., the number of physical qubits we need to obtain a target logical failure rate. Along with offering high thresholds against structured noise, we show that architectures based on the XZZX code require very low overhead to achieve a given target logical failure rate. We consider a biased noise model, where dephasing errors occur more frequently than other errors, by a factor η. (See Sec. III for a precise definition of the noise model.) In general, for large system sizes and low error rates p with noise bias η, the logical failure rate scales like $O((p/\sqrt{\eta})^{d/2})$, where d is the distance of the code. This improves the logical failure rate by a factor of $\sim \eta^{-d/4}$, meaning we can achieve a target logical failure rate using considerably fewer qubits at large bias. We also show that for near-term devices, i.e., small systems with error rates near threshold, the logical failure rate has a quadratically improved scaling with code distance, as $O(p^{d^2/2})$. This scaling has been demonstrated at infinite bias with a tailored surface code in prior work [8]. Thus, we should expect to achieve low logical failure rates using a modest number of physical qubits for experimentally plausible values of the noise bias where, say, $10 \lesssim \eta \lesssim 1000$ [24,25].
Finally, we consider fault-tolerant quantum computation with biased noise [26][27][28], and we show that the advantages of the XZZX code persist in this context. We show how to implement low-overhead fault-tolerant Clifford gates by taking advantage of the noise structure as the XZZX code undergoes measurement-based deformations [29][30][31]. With an appropriate lattice orientation, noise with bias η is shown to yield a reduction in the required number of physical qubits by a factor of ∼ log η in a large-scale quantum computation. These advantages already manifest at code sizes attainable using present-day quantum devices.

II. THE XZZX SURFACE CODE
The XZZX surface code is locally equivalent to the conventional surface code [12][13][14]. The stabilizer generators $S_f$ are associated with each face and are given by the product of two Pauli-X terms and two Pauli-Z terms as shown in Fig. 1(a). This variant of the surface code was first presented in Ref. [15], and was subsequently considered as a topological memory [33]. To contrast the XZZX surface code with its conventional counterpart, we refer to the latter as the CSS surface code because it is of Calderbank-Shor-Steane type [34,35].
Together with a choice of code, we require a decoding algorithm to determine which errors have occurred and correct for them. We will consider Pauli errors E ∈ P, and we say that E creates a defect at face f if $S_f E = -E S_f$. A decoder takes as input the error syndrome (the locations of the defects) and returns a correction that will recover the encoded information with high probability. The failure probability of the decoder decays rapidly with increasing code distance, d, assuming the noise experienced by the physical qubits is below some threshold rate.
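As an illustration of the defect condition, the following minimal Python sketch computes the syndrome of a Pauli error on an L × L lattice with periodic boundary conditions. The vertex labeling, the placement of X and Z within each face (X on the top-left and bottom-right corners, Z on the other two), and the name `xzzx_syndrome` are assumptions made for this example; they are one consistent choice and are not read off from Fig. 1.

```python
def xzzx_syndrome(errors, L):
    """Return the set of defect faces for a Pauli error on an L x L torus.

    Assumed layout (hypothetical, for illustration only): qubits sit on
    vertices (row, col); face (i, j) has corners (i, j), (i, j+1),
    (i+1, j), (i+1, j+1) and stabilizer X, Z, Z, X on them in that order.
    `errors` maps a vertex (row, col) to 'X', 'Y' or 'Z'.
    """
    defects = set()
    for (a, b), pauli in errors.items():
        flipped = []
        if pauli in ("Z", "Y"):
            # A Z component anticommutes with the faces that act with X here.
            flipped += [(a, b), (a - 1, b - 1)]
        if pauli in ("X", "Y"):
            # An X component anticommutes with the faces that act with Z here.
            flipped += [(a, b - 1), (a - 1, b)]
        for i, j in flipped:
            # Symmetric difference: a face flipped twice shows no defect.
            defects ^= {(i % L, j % L)}
    return defects


single_z = xzzx_syndrome({(2, 2): "Z"}, L=5)
z_string = xzzx_syndrome({(2, 2): "Z", (3, 3): "Z"}, L=5)
```

In this layout a single Pauli-Z error produces two defects, and two neighboring Pauli-Z errors on the same diagonal produce defects only at the ends of the resulting string.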
Because of the local change of basis, the XZZX surface code responds differently to Pauli errors compared with the CSS surface code. We can take advantage of this difference to design better decoding algorithms. Let us consider the effect of different types of Pauli errors, starting with Pauli-Z errors. A single Pauli-Z error gives rise to two nearby defects. In fact, we can regard a Pauli-Z error as a segment of a string where defects lie at the endpoints of the string segment, and where multiple Pauli-Z errors compound into longer strings; see Fig. 1(b).
Fig. 1. The XZZX surface code. Qubits lie on the vertices of the square lattice. The codespace is the common +1 eigenspace of its stabilizers $S_f$ for all faces f of the lattice. (a) An example of a stabilizer $S_f$. Unlike the conventional surface code, all stabilizers take the same form for every face. (b) Pauli-Z errors give rise to string-like errors that align along a common direction, enabling a one-dimensional decoding strategy. (c) The product of stabilizer operators along a diagonal gives rise to the symmetries under an infinite bias dephasing noise model [10,32]. (d) Pauli-X errors align along lines with an orthogonal orientation. At finite bias, errors in conjugate bases couple the lines. (e) Pauli-Y errors can be decoded as in Ref. [10].
A key feature of the XZZX code that we will exploit is that Pauli-Z error strings align along the same direction, as shown in Fig. 1(b). We can understand this phenomenon in more formal terms from the perspective of symmetries [10,32]. Indeed, the product of face operators along a diagonal such as that shown in Fig. 1(c) commutes with Pauli-Z errors. This symmetry guarantees that defects created by Pauli-Z errors will respect a parity conservation law on the faces of a diagonal oriented along this direction. Using this property, we can decode Pauli-Z errors on the XZZX code as a series of disjoint repetition codes. It follows that, for a noise model described by independent Pauli-Z errors, this code has a threshold error rate of 50%.
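The reduction to repetition codes can be sketched in a few lines of Python. The example below treats one diagonal of a periodic lattice as a ring of qubits and checks, and returns the lower-weight correction consistent with the syndrome. The ring indexing (check k sits between qubits k and k+1) and the function names are assumptions chosen for illustration, not the decoder used for the numerical results.

```python
def ring_syndrome(error):
    """Syndrome of a cyclic repetition code: check k compares qubits k and k+1."""
    L = len(error)
    return [error[k] ^ error[(k + 1) % L] for k in range(L)]


def decode_ring(syndrome):
    """Return the lower-weight error pattern consistent with `syndrome`.

    Pure Pauli-Z noise always creates an even number of defects on a ring
    (the parity conservation law), so an odd count is rejected.
    """
    L = len(syndrome)
    if sum(syndrome) % 2:
        raise ValueError("odd defect parity: not a pure dephasing syndrome")
    # Fix qubit 0 to 'no error' and propagate the syndrome around the ring.
    candidate = [0]
    for k in range(L - 1):
        candidate.append(candidate[-1] ^ syndrome[k])
    # The only other consistent pattern is the complement; keep the lighter one.
    if 2 * sum(candidate) > L:
        candidate = [bit ^ 1 for bit in candidate]
    return candidate


light_error = [0, 1, 1, 0, 0]   # weight 2 of 5: corrected exactly
heavy_error = [1, 1, 1, 0, 0]   # weight 3 of 5: decoder picks the complement
recovered_light = decode_ring(ring_syndrome(light_error))
recovered_heavy = decode_ring(ring_syndrome(heavy_error))
```

The correction succeeds whenever fewer than half of the qubits on the ring are flipped, which is the origin of the 50% threshold for independent Pauli-Z errors.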
Likewise, Pauli-X errors act similarly to Pauli-Z errors, but with Pauli-X error strings aligned along the orthogonal direction to the Pauli-Z error strings. In general, we would like to be able to decode all local Pauli errors, where error configurations of Pauli-X and Pauli-Z errors violate the one-dimensional symmetries we have introduced, e.g. Fig. 1(d). As we will see, we can generalize conventional decoding methods to account for finite but high bias of one Pauli operator relative to others and maintain a very high threshold.
Fig. 2. Optimal code-capacity threshold estimates for surface code variants over all single-qubit Pauli channels. Threshold estimates are found using approximate maximum-likelihood decoding for the XZZX (left) and CSS (right) surface codes. The gray triangle represents a parametrisation of all single-qubit Pauli channels, where the center corresponds to depolarizing noise, the labeled vertices correspond to pure X and Z noise, and the third vertex corresponds to pure Y noise; see Sec. B for details. For the XZZX code, estimates closely match the hashing bound (not shown) for all single-qubit Pauli channels. For the CSS code, estimates closely match the hashing bound for Y-biased noise but fall well below for X- and Z-biased noise. All estimates use code distances d ∈ {13, 17, 21, 25}.
We finally remark that the XZZX surface code responds to Pauli-Y errors in the same way as the CSS surface code. Each Pauli-Y error creates four defects, one on each of its adjacent faces; see Fig. 1(e). We therefore see that the high-performance decoders presented in Refs. [7,8,10] are readily adapted for the XZZX code in this limit.

III. OPTIMAL THRESHOLDS
The XZZX code has exceptional thresholds for all single-qubit Pauli noise channels. We demonstrate this fact using an efficient maximum-likelihood decoder [36], which gives the optimal threshold attainable with the code for a given noise model. Remarkably, we find that the XZZX surface code achieves code-capacity threshold error rates that closely match the zero-rate hashing bound for all single-qubit Pauli noise channels, and appears to exceed this bound in some regimes.
We define the general single-qubit Pauli noise channel
$$\mathcal{E}(\rho) = (1 - p)\,\rho + p\,\left(r_X X\rho X + r_Y Y\rho Y + r_Z Z\rho Z\right), \tag{1}$$
where p is the probability of any error on a single qubit and the channel is parameterized by the stochastic vector $r = (r_X, r_Y, r_Z)$, where $r_X, r_Y, r_Z \geq 0$ and $r_X + r_Y + r_Z = 1$. The set of all possible values of r forms an equilateral triangle, where the centre point (1/3, 1/3, 1/3) corresponds to standard depolarizing noise, and vertices (1, 0, 0), (0, 1, 0) and (0, 0, 1) correspond to pure X, Y and Z noise, respectively. We also define biased noise channels, which are restrictions of this general noise channel, parameterized by the scalar η; for example, in the case of Z-biased noise, we define $\eta = r_Z/(r_X + r_Y)$ where $r_X = r_Y$, such that η = 1/2 corresponds to standard depolarizing noise and the limit η → ∞ corresponds to pure Z noise.
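To make the parametrization concrete, the following Python sketch converts a Z-bias η into the stochastic vector r and then into the four Pauli probabilities of the channel. The function names `z_biased_r` and `pauli_probabilities` are hypothetical and introduced only for this example.

```python
import math


def z_biased_r(eta):
    """Stochastic vector (r_X, r_Y, r_Z) for Z-biased noise with bias eta.

    Uses eta = r_Z / (r_X + r_Y) with r_X = r_Y and r_X + r_Y + r_Z = 1.
    """
    if math.isinf(eta):
        return (0.0, 0.0, 1.0)          # pure Z noise
    r_z = eta / (1.0 + eta)
    r_xy = 0.5 / (1.0 + eta)
    return (r_xy, r_xy, r_z)


def pauli_probabilities(p, r):
    """Probabilities of I, X, Y, Z for total error probability p."""
    r_x, r_y, r_z = r
    return {"I": 1.0 - p, "X": p * r_x, "Y": p * r_y, "Z": p * r_z}


depolarizing = z_biased_r(0.5)          # eta = 1/2: all three r equal 1/3
biased = pauli_probabilities(0.1, z_biased_r(100.0))
```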
We estimate the threshold error rate as a function of r for both the XZZX surface code with boundaries and the CSS surface code with boundaries, see Fig. 9, using a tensor network decoder that gives a controlled approximation to the maximum-likelihood decoder [7,8,36]. Our results are summarized in Fig. 2. We find that the thresholds of the XZZX surface code closely match or slightly exceed (as discussed below) the zero-rate hashing bound for all investigated values of r, with a global minimum $p_c = 18.7(1)\%$ at standard depolarizing noise and peaks $p_c \sim 50\%$ at pure X, Y and Z noise. We find that the thresholds of the CSS surface code closely match this hashing bound for Y-biased noise, where Y errors dominate, consistent with prior work [7,8], as well as for channels where $r_Y < r_X = r_Z$ such that X and Z errors dominate but are balanced. In contrast to the XZZX surface code, we find that the thresholds of the CSS surface code fall well below this hashing bound as either X or Z errors dominate, with a global minimum $p_c = 10.8(1)\%$ at pure X and pure Z noise.
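For reference, the zero-rate hashing bound for a given r is the error probability at which the Shannon entropy of the Pauli distribution reaches one bit. The sketch below finds it by bisection; the function names and the bracketing interval [0, 1/2] are choices made for this illustration (the entropy is increasing in p on that interval and is at least one bit at p = 1/2 for any r).

```python
import math


def entropy_bits(probs):
    """Shannon entropy in bits, skipping zero-probability outcomes."""
    return -sum(q * math.log2(q) for q in probs if q > 0)


def hashing_bound(r, tol=1e-10):
    """Error probability p at which H(1-p, p*r_X, p*r_Y, p*r_Z) = 1 bit."""
    r_x, r_y, r_z = r
    lo, hi = 0.0, 0.5
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        h = entropy_bits([1.0 - mid, mid * r_x, mid * r_y, mid * r_z])
        if h < 1.0:
            lo = mid      # entropy still below one bit: bound lies above mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


p_hb_depolarizing = hashing_bound((1 / 3, 1 / 3, 1 / 3))   # about 0.189
p_hb_pure_z = hashing_bound((0.0, 0.0, 1.0))               # tends to 0.5
```

The depolarizing value of roughly 18.9% can be compared directly with the threshold estimate quoted above for the XZZX code at the centre of the triangle.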
In some cases, our estimates of XZZX surface code thresholds appear to exceed the zero-rate hashing bound; we now investigate this further. For the values of r investigated for Fig. 2, the mean difference between our estimates and the hashing bound is $p_c - p_{\mathrm{h.b.}} = -0.1(3)\%$ and our estimates never fall more than 1.1% below the hashing bound. However, for high bias, η ≥ 100, we observe an asymmetry between Y-biased noise and Z-biased (or, by code symmetry, X-biased) noise. In particular, we observe that, while threshold estimates with Y-biased noise match the hashing bound to within error bars, threshold estimates with highly biased Z noise significantly exceed the hashing bound. Our results with Z-biased noise are summarized in Fig. 3, where, since thresholds are defined in the limit of infinite code distance, we provide estimates with sets of increasing code distance for η ≥ 30. Although the gap typically reduces, it appears to stabilize for η = 30, 100, 1000, where we find $p_c - p_{\mathrm{h.b.}} = 1.2(2)\%$, $1.6(3)\%$, $3.7(3)\%$, respectively, with the largest code distances; for η = 300, the gap exceeds 2.9% but has clearly not yet stabilized. This evidence for exceeding the hashing bound appears to be robust, but warrants further study.
Fig. 3. [...] as a function of code distances used in the estimation. Data is shown for biases η = 30, 100, 300, 1000. Threshold estimates exceed the hashing bound in all cases. The gap reduces, in most cases, with sets of greater code distance, but it persists and appears to stabilize for η = 30, 100 and 1000. In both plots, error bars indicate one standard deviation relative to the fitting procedure.

IV. FAULT-TOLERANT THRESHOLDS
Having demonstrated the remarkable code-capacity thresholds of the XZZX code, we now demonstrate how to translate these high thresholds into practice using a matching decoder [14,37,38]. With this decoder, we find exceptionally high fault-tolerant thresholds, i.e., allowing for noisy measurements, with respect to a biased phenomenological noise model.
The minimum-weight perfect matching algorithm takes a graph with weighted edges and returns a perfect matching using the edges of the input graph such that the sum of the weights of the edges is minimal. We can use this algorithm for decoding by preparing a complete graph as an input to the algorithm where each defect has a corresponding vertex and we weight each edge by the logarithm of the probability that the error model introduced the pair of defects connected by the edge. We note that in the limit of infinite bias, this decoder respects the one-dimensional system symmetries described in Sec. II exactly [10,32]. Moreover, for unbiased noise models we recover the standard matching decoder [14,39]. Details of our decoder are discussed in Sec. C. The edges returned in the perfect matching correspond to the defects that should be locally paired by the correction.
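The pairing step can be illustrated with a small, self-contained Python sketch. It assumes a simplified weight in which each step along the high-rate direction costs $-\log[p_{\mathrm{h.r.}}/(1 - p_{\mathrm{h.r.}})]$ and each step along the low-rate direction costs $-\log[p_{\mathrm{l.r.}}/(1 - p_{\mathrm{l.r.}})]$, so that minimizing the total weight favors the most likely pairing. Both this weight function and the brute-force search are stand-ins chosen for clarity; they are not the weights of Sec. C, and practical decoders use a polynomial-time matching algorithm such as Edmonds' blossom algorithm.

```python
import math


def edge_weight(a, b, p_hr, p_lr):
    """Hypothetical weight between defects a and b given as (u, v) coordinates.

    u counts steps along the high-rate error direction, v along the
    low-rate direction; both rates are assumed to lie in (0, 1/2).
    """
    steps_hr = abs(a[0] - b[0])
    steps_lr = abs(a[1] - b[1])
    w_hr = -math.log(p_hr / (1.0 - p_hr))
    w_lr = -math.log(p_lr / (1.0 - p_lr))
    return steps_hr * w_hr + steps_lr * w_lr


def min_weight_perfect_matching(defects, weight):
    """Exhaustive minimum-weight perfect matching; only for a few defects."""
    if len(defects) % 2:
        raise ValueError("a perfect matching needs an even number of defects")
    if not defects:
        return 0.0, []
    first, rest = defects[0], defects[1:]
    best_total, best_pairs = math.inf, None
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        sub_total, sub_pairs = min_weight_perfect_matching(remaining, weight)
        total = weight(first, partner) + sub_total
        if total < best_total:
            best_total, best_pairs = total, [(first, partner)] + sub_pairs
    return best_total, best_pairs


defects = [(0, 0), (3, 0), (0, 1), (3, 1)]
_, biased_pairs = min_weight_perfect_matching(
    defects, lambda a, b: edge_weight(a, b, p_hr=0.1, p_lr=0.001))
_, unbiased_pairs = min_weight_perfect_matching(
    defects, lambda a, b: edge_weight(a, b, p_hr=0.05, p_lr=0.05))
```

With strongly biased rates the four defects are paired along the high-rate direction even though those pairs are farther apart; with equal rates the nearest-neighbor pairing is chosen instead.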
To detect measurement errors we repeat measurements over a long time, and identify defects at space-time locations where two sequential measurements return opposite outcomes [14]. With this framework, measurement errors can be interpreted as string-like errors aligned along the temporal axis where two defects are created at their endpoints. Minimum-weight perfect-matching can then be adapted for fault-tolerant decoding using the prescription given above.
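The identification of space-time defects amounts to comparing consecutive rounds of outcomes. A minimal Python sketch follows, under the assumptions that outcomes are stored as 0 (+1 eigenvalue) and 1 (−1 eigenvalue) and that the code starts in its codespace, so that the first round is compared against all-trivial outcomes; the function name is hypothetical.

```python
def spacetime_defects(rounds):
    """Return (time, face) locations where consecutive outcomes differ.

    `rounds[t][f]` is the outcome (0 or 1) of stabilizer f in round t.
    Round 0 is compared against an assumed all-zero reference round.
    """
    defects = []
    previous = [0] * len(rounds[0])
    for t, current in enumerate(rounds):
        for f, (old, new) in enumerate(zip(previous, current)):
            if old != new:
                defects.append((t, f))
        previous = current
    return defects


# A single faulty readout of face 1 in round 1: two defects along time.
measurement_error = spacetime_defects([[0, 0, 0], [0, 1, 0], [0, 0, 0]])
# A physical error before round 1 flipping faces 0 and 1: two defects in space.
physical_error = spacetime_defects([[0, 0, 0], [1, 1, 0], [1, 1, 0]])
```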
We evaluate fault-tolerant thresholds by finding logical failure rates using Monte Carlo sampling for different system parameters. Unless stated otherwise, we simulate the XZZX code on a d × d lattice with periodic boundary conditions, and we perform d rounds of stabilizer measurements. We regard a given sample as a failure if the decoder introduces a logical error to the code qubits, or if the combination of the error string and its correction returned by the decoder includes a non-trivial cycle along the temporal axis. It is important to check for temporal errors, as they can cause logical errors when we perform fault-tolerant logic gates by code deformation.
Fig. 4. Threshold using a matching decoder as a function of dephasing noise bias. We show fault-tolerant thresholds with phenomenological noise (left) and code-capacity thresholds (right) compared with previous decoders [10]. (left) We compare the results obtained with the XZZX code with Z-biased noise (blue) to the threshold obtained with the CSS code with Y-biased noise (red) [10]. Equivalent results to the red points are obtained with Z-biased noise using the tailored code of Ref. [7]. The XZZX code significantly outperforms the CSS code for all noise biases. Data points are found with $3 \times 10^4$ Monte Carlo samples for lattices with d = 12, 14, ..., 20 at finite bias and d = 24, 28, ..., 40 at infinite bias. We compare our results to the threshold found using a conventional matching decoder for the CSS surface code for the phenomenological noise model where bit-flip and dephasing errors are decoded independently, given by the solid line $p_{\mathrm{h.r.}} + p_{\mathrm{l.r.}} = 0.029$, where ∼ 2.9% is the threshold found in Ref. [39]. (right) Thresholds for the matching decoder shown as a function of bias η with noiseless measurements. Thresholds are collected using d × d periodic lattices with d = 24, 28, ..., 40 (d = 48, 56, ..., 80) for finite (infinite) bias. Data points are collected with $3 \times 10^4$ Monte Carlo samples. We find the XZZX surface code (blue) consistently outperforms the matching decoder for the CSS surface code [10] (red). We also find the XZZX code matches the zero-rate hashing bound (solid line) at modest biases.
The phenomenological noise model is defined such that qubits experience errors with probability p per unit time. These errors may be either high-rate Pauli-Z errors that occur with probability $p_{\mathrm{h.r.}}$ per unit time, or low-rate Pauli-X or Pauli-Y errors each occurring with probability $p_{\mathrm{l.r.}}$ per unit time. The noise bias with this phenomenological noise model is defined as $\eta = p_{\mathrm{h.r.}}/(2p_{\mathrm{l.r.}})$. One time unit is the time it takes to make a stabilizer measurement, and we assume we can measure all the stabilizers in parallel [5]. Each stabilizer measurement returns the incorrect outcome with probability $q = p_{\mathrm{h.r.}} + p_{\mathrm{l.r.}}$. To leading order, this measurement error rate is consistent with a measurement circuit where an ancilla is prepared in the state $|+\rangle$ and subsequently entangled to the qubits of $S_f$ with bias-preserving controlled-not and controlled-phase gates before its measurement in the Pauli-X basis. With such a circuit, Pauli-Y and Pauli-Z errors on the ancilla will alter the measurement outcome. At η = 1/2 this noise model interpolates to a conventional noise model where q = 2p/3 [40]. We also remark that hook errors [40,41] are low-rate events with this circuit as the control qubit of entangling gates commutes with the high-rate Pauli-Z errors, and so high-rate errors are not spread to the code.
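The relations between p, η and the individual rates of the phenomenological model can be collected in a short Python helper. It assumes that the total error probability decomposes as $p = p_{\mathrm{h.r.}} + 2p_{\mathrm{l.r.}}$ (one high-rate and two low-rate error types), which is the reading consistent with q = 2p/3 at η = 1/2; the function name is hypothetical.

```python
import math


def phenomenological_rates(p, eta):
    """Return (p_hr, p_lr, q) for total error rate p and bias eta.

    Assumes p = p_hr + 2 * p_lr with eta = p_hr / (2 * p_lr), and a
    measurement error probability q = p_hr + p_lr.
    """
    if math.isinf(eta):
        p_hr, p_lr = p, 0.0
    else:
        p_hr = p * eta / (1.0 + eta)
        p_lr = p / (2.0 * (1.0 + eta))
    q = p_hr + p_lr
    return p_hr, p_lr, q


p_hr, p_lr, q = phenomenological_rates(0.03, 0.5)   # unbiased case: q = 2p/3
```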
Intuitively, the decoder will preferentially pair defects along the diagonals associated with the dominant error. In the limit of infinite bias at q = 0, the decoder corrects the Pauli-Z errors by treating the XZZX code as independent repetition codes. It follows that by extending the syndrome along the temporal direction to account for the phenomenological noise model with infinite bias, we effectively decode d decoupled copies of the two-dimensional surface code. With the minimum-weight perfect matching decoder we therefore expect a fault-tolerant threshold ∼ 10.3% [14]. In both the case of phenomenological noise, and the case with ideal measurements, when η = 1/2 the minimum-weight perfect-matching decoder is equivalent to the well-studied conventional decoder [14,39]. We use these observations to check that our decoder behaves correctly in these limits.
In Fig. 4, we present the thresholds we obtain for the phenomenological noise model as a function of the noise bias η. In the fault-tolerant case, we find our decoder tends towards a threshold of ∼ 10% as the bias becomes large. We note that the threshold error rate appears lower than the expected ∼ 10.3%; we suggest that this is a small-size effect that we also see in the ideal case with q = 0 (see discussion below). Notably, our decoder significantly surpasses the thresholds found for the CSS surface code against biased Pauli-Y errors [10]. We also compare our results to a conventional minimum-weight perfect-matching decoder for the CSS surface code where we correct bit-flip errors and dephasing errors separately. As we see, our decoder for the XZZX code is equivalent to the conventional decoding strategy at η = 1/2 and outperforms it for all other values of noise bias.
We also characterize our minimum-weight perfect-matching decoder in the ideal case where q = 0. Remarkably, our decoder closely follows the hashing bound at high bias. We observe our data drops below the hashing bound at very high bias. As in the phenomenological case, we attribute this to a small-size effect. Indeed, the success of the decoder depends on effectively decoding ∼ d independent copies of the repetition code correctly. Although the logical failure rate of a single repetition code decays exponentially, at finite system sizes and near to threshold, $p \lesssim p_c$, the logical failure rate of each repetition code $P \sim (p/p_c)^{d/2}$ is appreciable since $p/p_c$ is close to 1. We therefore find it difficult to resolve the anticipated threshold since the logical failure rate of the entire system, $\sim dP$, tends very slowly to zero due to the linear factor d. Likewise, for the phenomenological case, the success of the decoder depends on correctly decoding d independent copies of the surface code at high bias.
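The slow convergence can be made quantitative with a short Python sketch based on the heuristic estimate $dP$ with $P = (p/p_c)^{d/2}$. Setting the derivative with respect to d to zero shows that the estimate only begins to decrease for $d > -2/\ln(p/p_c)$. The function names are hypothetical, and the estimate is a union-bound style quantity that can exceed 1, not a probability.

```python
import math


def system_failure_estimate(ratio, d):
    """Heuristic d * (p / p_c)^(d / 2) for ratio = p / p_c < 1."""
    return d * ratio ** (d / 2.0)


def turnover_distance(ratio):
    """Distance beyond which d * ratio^(d / 2) starts to decrease."""
    return -2.0 / math.log(ratio)


d_star = turnover_distance(0.95)    # roughly 39 for p / p_c = 0.95
estimates = [system_failure_estimate(0.95, d) for d in (20, 40, 200)]
```

For an error rate at 95% of threshold, the estimate keeps growing until d is close to 40, which illustrates why the threshold is hard to resolve with the lattice sizes used here.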

V. SUB-THRESHOLD SCALING
We now show that the exceptional error thresholds of the XZZX surface code are accompanied by significant advantages in terms of the scaling of the logical failure rate when error rates are below threshold. Improvements in scaling will reduce the resource overhead, because fewer physical qubits will be needed to achieve a desired logical failure rate.
For a pure dephasing noise channel, it has been shown that a tailored surface code can tolerate $\sim n/2 \gg d$ dephasing errors [8], where $n \sim d^2$ is the number of qubits of the code. We expect such a code to have a very good logical failure rate that scales like $O(p_{\mathrm{h.r.}}^{d^2/2})$. However, for more general noise models, other mechanisms can cause logical failures. For instance, low-rate errors can align with high-rate errors along a path of length $\sim d$. In that case, we would expect the scaling of the logical failure rate to be governed by the number of low-rate errors that are needed to cause the decoder to fail.
In general, the common failure mechanisms for this code will depend on the system parameters. In what follows we identify two distinct scaling regimes whose behavior is determined by the relative dominance of the two aforementioned failure mechanisms. We show these two regimes using the XZZX code on a rectangular lattice with dimensions d × (d + 1) and periodic boundary conditions. With these lattice parameters, the XZZX code exemplifies a system that can tolerate up to d(d + 1)/2 dephasing errors. We focus on the simplest case with ideal measurements, i.e., where q = 0, to numerically demonstrate the logical failure rate scaling associated with each regime.
Let us examine the different failure mechanisms for the XZZX code on this lattice more carefully. Restricting to Pauli-Z errors, the weight of the only non-trivial logical operator is d(d + 1). This means the code can tolerate up to d(d + 1)/2 dephasing errors, and we can therefore expect failures due to high-rate errors to occur with probability
$$P_{\mathrm{quad.}} \sim N_{\mathrm{h.r.}}\, p_{\mathrm{h.r.}}^{d^2/2} \tag{2}$$
below threshold, where $N_{\mathrm{h.r.}} \sim 2^{d^2}$ is the number of configurations that can cause a failure. We compare this failure rate to the probability of a logical error caused by a string of d/4 high-rate errors and d/4 low-rate errors. We thus consider the ansatz of Eqn. (3) for $P_{\mathrm{lin.}}$, where $N_{\mathrm{l.r.}} \sim 2^{\gamma d}$ is an entropy term with $3/2 \lesssim \gamma \lesssim 2$ [42]. We justify this ansatz and estimate γ in Sec. D. This structured noise model thus leads to two distinct regimes, depending on which failure process is dominant. In the first regime where $P_{\mathrm{quad.}} \gg P_{\mathrm{lin.}}$, we expect that the logical failure rate will decay like $\sim p_{\mathrm{h.r.}}^{d^2/2}$. We find this behavior with systems of a finite size and at high bias where error rates are near to threshold. We evaluate logical failure rates using numerical simulations to demonstrate the behavior that characterizes this regime; see Fig. 5 (left). Our data shows good agreement with the scaling ansatz $P = Ae^{Bd^2}$. In contrast, our data is not well described by a scaling $P = Ae^{Bd}$.
We observe the regime where $P_{\mathrm{lin.}} \gg P_{\mathrm{quad.}}$ using numerics at small p and modest η. In this regime, logical errors are caused by a mixture of low-rate and high-rate errors that align along a path of weight O(d) on some non-trivial cycle. In Fig. 5 (right), we show that the data agree well with the ansatz of Eqn. (3), with γ ∼ 1.8. This remarkable correspondence to our data shows that our decoder is capable of decoding up to ∼ d/4 low-rate errors, even when a relatively large number of high-rate errors occur simultaneously on the lattice.
In summary, for either scaling regime, we find that there are significant implications for overheads. We emphasise that the generic case for fault-tolerant quantum computing is expected to be the regime dominated by $P_{\mathrm{lin.}}$. In this regime, the logical failure rate of a code is expected to decay as $P \sim p^{d/2}$ below threshold [5,43,44]. Under biased noise, our numerics show that failure rates $P \sim (p/\sqrt{\eta})^{d/2}$ can be obtained. This additional decay factor $\sim \eta^{-d/4}$ in our expression for logical failure rate means we can achieve a target logical failure rate with far fewer qubits at high bias. The regime dominated by $P_{\mathrm{quad.}}$ scaling is particularly relevant for near-term devices that have a small number of qubits operating near the threshold error rate. In this situation, we have demonstrated a very rapid decay in logical failure rate like $\sim p^{d^2/2}$ at high bias, provided the code can tolerate $\sim d^2/2$ dephasing errors.
Fig. 5. [...] The data fit the former very well; for the latter, the gradients of the best fit dashed lines, as shown on the inset plot as a function of log(p/(1 − p)), give a linear slope of 0.61(3). Because this slope exceeds the value of 0.5, we conclude that the sub-threshold scaling is not consistent with $P_{\mathrm{lin.}} = Ae^{Bd}$. (right) Logical failure rates at modest bias far below threshold. The data (markers) were collected at bias η = 3 and coprime d × (d + 1) code dimensions of d = 5, 7, 9, 11, 13 and 15 assuming ideal measurements. Data is collected using the Metropolis algorithm and splitting method presented in [45,46]. The solid lines represent the prediction of Eqn. (3). The data shows very good agreement with the single parameter fitting for all system sizes as p tends to zero.
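The consequence of the two scalings $P \sim p^{d/2}$ and $P \sim (p/\sqrt{\eta})^{d/2}$ for the required code distance can be estimated with the Python sketch below. It inverts $P = b^{d/2}$ for a base b, ignoring all prefactors and entropy terms, so the numbers are indicative only; the function name and the example values of p, η and the target failure rate are assumptions.

```python
import math


def required_distance(target, base):
    """Distance d solving base^(d / 2) = target, with prefactors ignored.

    Use base = p for the unbiased scaling and base = p / sqrt(eta) for
    the biased scaling; in practice d would be rounded up.
    """
    return 2.0 * math.log(target) / math.log(base)


p, eta, target = 0.01, 100.0, 1e-12
d_unbiased = required_distance(target, p)                    # about 12
d_biased = required_distance(target, p / math.sqrt(eta))     # about 8
```

For these illustrative values the distance drops from about 12 to about 8; with $n \sim d^2$ qubits per patch this corresponds to fewer than half as many physical qubits.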

VI. LOW-OVERHEAD FAULT-TOLERANT QUANTUM COMPUTATION
As with the CSS surface code, we can perform fault-tolerant quantum computation with the XZZX code using code deformations [29][30][31][48][49][50]. Here we show how to maintain the advantages that the XZZX code demonstrates as a memory experiencing structured noise, namely, its high threshold error rates and its reduced resource costs, while performing fault-tolerant logic gates.
A code deformation is a type of fault-tolerant logic gate where we manipulate encoded information by changing the stabilizer group we measure [48,50]. These altered stabilizer measurements project the system onto a new stabilizer code where the encoded information has been transformed or 'deformed'. These deformations allow for Clifford operations with the surface code; Clifford gates are universal for quantum computation when supplemented with the noisy initialization of magic states [51]. Although initialization circuits have been proposed to exploit a bias in the noise [52], here we focus on fault-tolerant Clifford operations and the fault-tolerant preparation of logical qubits in the computational basis.
Many approaches for code deformations have been proposed that, in principle, could be implemented in a way to take advantage of structured noise using a tailored surface code. These approaches include braiding punctures [48][49][50][53], lattice surgery [29,30,47,54] and computation with twist defects [30,55,56]. We focus on a single example based on lattice surgery as in Refs. [31,47]; see Fig. 6 (top left). We will provide a high-level overview and leave open all detailed questions of implementation and threshold estimates for fault-tolerant quantum computation to future work.
Our layout for fault-tolerant quantum computation requires the fault-tolerant initialization of a hexon surface code, i.e., a surface code with six twist defects at its boundaries; see Fig. 6 (bottom left). Note that the lattice geometry has been rotated compared with Fig. 1; we will justify this choice shortly. We can fault-tolerantly initialize this code in eigenstates of the computational basis through a process detailed in Fig. 6. We remark that the reverse operation, where we measure qubits of the XZZX surface code in this same product basis, will read the code out while respecting the properties required to be robust to the noise bias.
We briefly confirm that this method of initialization is robust to our biased noise model. Principally, this method must correct high-rate Pauli-Z errors on the red qubits, as Pauli-Z errors act trivially on the blue qubits in eigenstates of the Pauli-Z operator during preparation. Given that the initial state is already in an eigenstate of some of the stabilizers of the XZZX surface code, we can detect these Pauli-Z errors on red qubits, see, e.g. Fig. 6(e). The shaded faces will identify defects due to the Pauli-Z errors. Moreover, as we discussed before, strings created by Pauli-Z errors align along horizontal lines using the XZZX surface code. This, again, is due to the stabilizers of the initial state respecting the one-dimensional symmetries of the code under pure dephasing noise. In addition to robustness against high-rate errors, low-rate errors as in Fig. 6(f) can also be detected on blue qubits. The bit-flip errors violate the stabilizers we initialize when we prepare the initial product state. As such, we can adapt high-threshold error-correction schemes we have proposed for initialization to detect these errors for the case of finite bias. We therefore benefit from the advantages of the XZZX surface code under a biased error model during initialization.
Fig. 6. [...] This initialization strategy is robust to biased noise. Pauli-Z errors that can occur on red vertices are detected by the shaded faces (c). We can also detect low-rate Pauli-X errors on blue vertices using this method of initialization, see (d). We can decode all of these initialization errors on this subset of faces using the minimum-weight perfect matching decoder in the same way we decode the XZZX code as memory. (right) The hexon surface code fused to the ancillary surface code to perform a logical Pauli-Y measurement. The lattice surgery procedure introduces a twist in the centre of the lattice. We show the symmetry with respect to the Pauli-Z errors by lightly colored faces. Again, decoding this model in the infinite bias limit is reduced to decoding one-dimensional repetition codes, except at the twist where there is a single branching point.
We also note that further improvements can be found with a judicious choice of boundary conditions for this lattice geometry. Specifically, because the high-rate error strings of the biased noise model align along the horizontal direction only, we can change the spatial dimensions of the lattice without compromising the performance of the code significantly. In practice, at high dephasing bias, we can have $d_X \ll d_Z$, where $d_X$ ($d_Z$) denotes the code distance for Pauli-X (Pauli-Z) errors. This choice may have a dramatic effect on the resource scaling of large-scale quantum computation. At high bias and low error rates, we estimate that

$d_Z \approx d_X \left(1 - \frac{\log \eta}{\log p}\right)$   (4)

is optimal. To see this, let us suppose that the probability of a logical failure due to high(low)-rate errors is $P_{\mathrm{h.r.}} \approx p^{d_Z/2}$ ($P_{\mathrm{l.r.}} \approx (p/\eta)^{d_X/2}$), where we have neglected entropy terms and assumed $p_{\mathrm{h.r.}} \sim p$ and $p_{\mathrm{l.r.}} \sim p/\eta$. Equating $P_{\mathrm{l.r.}}$ and $P_{\mathrm{h.r.}}$ gives us Eqn. (4). Similar results have been obtained in, e.g., [16,26,57,58,59] with other codes. Assuming then an error rate that is far below threshold, e.g., $p \sim 1\%$, and a reasonable bias we might expect, $\eta \sim 100$, we find an aspect ratio $d_X \sim d_Z/2$. We also suggest that small-scale rectangular XZZX codes might be well suited for implementation on near-term devices; see Fig. 7.
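As a quick numerical check of the aspect-ratio estimate, the following Python sketch evaluates the ratio $d_Z/d_X = 1 - \log\eta/\log p$ that follows from equating $p^{d_Z/2}$ with $(p/\eta)^{d_X/2}$. The function name `aspect_ratio` is a hypothetical helper introduced here for illustration, and entropy terms are neglected as in the text.

```python
import math


def aspect_ratio(p, eta):
    """Return d_Z / d_X obtained by equating p**(d_Z/2) with (p/eta)**(d_X/2).

    Entropy terms are neglected, as in the estimate in the text.
    """
    if not 0 < p < 1:
        raise ValueError("p must lie strictly between 0 and 1")
    if eta <= 0:
        raise ValueError("eta must be positive")
    # Taking logarithms: d_Z * log(p) = d_X * (log(p) - log(eta)).
    return 1 - math.log(eta) / math.log(p)


# Example from the text: p ~ 1% and eta ~ 100.
ratio = aspect_ratio(0.01, 100)
print(f"d_Z / d_X = {ratio:.3f}")
```

For $p \sim 1\%$ and $\eta \sim 100$ the ratio evaluates to 2, which is the $d_X \sim d_Z/2$ aspect ratio quoted above.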
Code deformations amount to initializing and reading out different patches of a large surface code lattice. As such, performing arbitrary code deformations while preserving the biased-noise protection offered by the XZZX surface code is no more complicated than what has already been demonstrated, with one exception. We might consider generalizations of lattice surgery or other code deformations where we can perform fault-tolerant Pauli-Y measurements. In this case, we introduce a twist to the lattice [55] and, as such, we need to reexamine the symmetries of the system to propose a high-performance decoder. We show the twist in the centre of Fig. 6 (right) together with its weight-five stabilizer operator. A twist introduces a branch in the one-dimensional symmetries of the XZZX surface code. A minimum-weight perfect-matching decoder can easily be adapted to account for this branch. Moreover, should we consider performing fault-tolerant Pauli-Y measurements, we do not expect that a branch at a single location on the lattice will have a significant impact on the performance of the code experiencing structured noise. Indeed, even with a twist on the lattice, the majority of the lattice is decoded as a series of one-dimensional repetition codes in the infinite-bias limit.

VII. DISCUSSION
We have shown how fault-tolerant quantum architectures based on the XZZX surface code yield remarkably high memory thresholds and low overhead as compared with the conventional surface code approach. Our generalized fault-tolerant decoder can realize these advantages over a broad range of biased error models representing what is observed in experiments for a variety of physical qubits.
The performance of the XZZX code is underpinned by its exceptional code-capacity thresholds, which match the performance of random coding (hashing) theory, suggesting that this code may be approaching the limits of what is possible. In contrast to this expectation, the XZZX surface code threshold is numerically observed to exceed this hashing bound for certain error models, opening the enticing possibility that random coding is not the limit for practical thresholds. We note that for both code capacities and fault-tolerant quantum computing, the highest achievable error thresholds are not yet known.
We emphasize that the full potential of our results lies not just in the demonstrated advantages of using this particular architecture, but rather in the indication that further innovations in codes and architectures may still yield significant gains in thresholds and overheads. We have shown that substantial gains in thresholds can be found when the code and decoder are tailored to the relevant noise model. While the standard approach to decoding the surface code considers Pauli-X and Pauli-Z errors separately, we have shown that a tailored non-CSS code and decoder can outperform this strategy for essentially all structured error models. There is a clear avenue to generalize our methods and results to the practical setting involving correlated errors arising from more realistic noise models as we perform fault-tolerant logic. We suggest that the theory of symmetries [10,32] may offer a formalism to make progress in this direction.
Because our decoder is based on minimum-weight matching, there are no fundamental obstacles to adapting it to the more complex setting of circuit noise [40,49,60]. We expect that the high numerical thresholds we observe for phenomenological noise will, when adapted to circuit-level noise, continue to outperform the conventional surface code, especially when using gates that preserve the structure of the noise [27,28]. We expect that the largest performance gains will be obtained by using information from a fully characterized Pauli noise model [61][62][63] that goes beyond the single-qubit error models considered here.
Along with high thresholds, the XZZX surface code architecture can yield significant reductions in the overheads for fault-tolerant quantum computing, through improvements to the sub-threshold scaling of logical error rates. It is in this direction that further research into tailored codes and decoders may provide the most significant advances, bringing down the astronomical numbers of physical qubits needed for fault-tolerant quantum computing. A key future direction of research would be to carry these improvements over to codes and architectures that promise improved (even constant) overheads [64][65][66]. Recent research on fault-tolerant quantum computing using low-density parity-check (LDPC) codes that generalize concepts from the surface code [67][68][69][70][71][72][73] provides a natural starting point.

ACKNOWLEDGMENTS
We are grateful to A. Darmawan, A. Grimsmo and S. Puri for discussions, to E. Campbell for comments on an earlier draft, and especially to J. Wootton for recommending consideration of the XZZX code for biased noise [74]. This work is supported by the Australian Research Council via the Centre of Excellence in Engineered Quantum Systems (EQUS) project number CE170100009. BJB also received support from the University of Sydney Fellowship Programme. Access to high-performance computing resources was provided by the National Computational Infrastructure (NCI Australia), an NCRIS-enabled capability supported by the Australian Government, and the Sydney Informatics Hub, a Core Research Facility of the University of Sydney.

Appendix A: Threshold estimation
For the estimation of a threshold error rate $p_c$, we use the critical exponent method of Ref. [39]. According to this method, if we define a correlation length $\xi = (p - p_c)^{-\nu}$ in terms of the physical error rate $p$ and some critical exponent $\nu$, then, for sufficiently large code distance $d$, we expect the logical failure rate $f$ to depend only on the dimensionless ratio $d/\xi$. Consequently, we define the rescaled variable $x = (p - p_c) d^{1/\nu}$, use a quadratic model, $f = A + Bx + Cx^2$, and fit to this model to find $p_c$, $\nu$ and the nuisance parameters $A$, $B$, $C$. See Ref. [75] for a discussion of the validity of this method. Unless otherwise stated, error bars are obtained by jackknife resampling, i.e., the standard deviation in threshold estimates when reapplying the fitting method with a single code distance $d$ removed, over all simulated code distances $d$. As a representative illustration of this method, we estimate the optimal threshold error rate $p_c$ of the XZZX surface code under X-biased noise with $\eta = 100$, assuming ideal measurements, in Fig. 8.

The decoder of Bravyi, Suchara, and Vargo [36] efficiently approximates maximum-likelihood decoding by mapping coset probabilities to tensor-network contractions. Contractions are approximated by reducing the size of the tensors during contraction through Schmidt decomposition and retention of only the $\chi$ largest Schmidt values. This approach, appropriately adapted, has been found to converge well with modest values of $\chi$ for a range of Pauli noise channels and surface code layouts [8,36]. A full description of the tensor network used in our simulations with the rotated CSS surface code (see Fig. 9, right) is provided in Ref. [8]; adaptation to the XZZX surface code (see Fig. 9, left) is a straightforward redefinition of tensor element values for the uniform stabilizers.

Figure 2, which shows threshold values over all single-qubit Pauli noise channels for CSS and XZZX surface codes, is constructed as follows.
Each threshold surface is formed using Delaunay triangulation of 211 threshold values. Since both CSS and XZZX surface codes are symmetric in the exchange of Pauli X and Z, 111 threshold values are estimated for each surface. Sample noise channels are distributed radially such that the spacing reduces quadratically towards the sides of the triangle representing all single-qubit Pauli noise channels; see Fig. 10. Each threshold is estimated over four code distances $d \in \{13, 17, 21, 25\}$, at least six physical error probabilities, and 30 000 simulations per code distance and physical error probability. In all simulations, a tensor-network decoder approximation parameter of $\chi = 16$ is used to achieve reasonable convergence over all sampled single-qubit Pauli noise channels for the given code sizes.

Figure 3, which investigates threshold estimates exceeding the hashing bound for the XZZX surface code with Z-biased noise, is constructed as follows. For bias $30 \le \eta \le 1000$, where XZZX threshold estimates exceed the hashing bound, we run compute-intensive simulations; each threshold is estimated over sets of four code distances up to $d \in \{65, 69, 73, 77\}$, at least fifteen physical error probabilities, and 60 000 simulations per code distance and physical error probability. Interestingly, for the XZZX surface code with Z-biased noise, we find the tensor-network decoder converges extremely well, as summarized in Fig. 11 for code distance $d = 77$, allowing us to use $\chi = 8$. For $\eta = 30$, the shift in logical failure rate between $\chi = 8$ and the largest $\chi$ shown is less than one fifth of a standard deviation over 30 000 simulations, and for $\eta > 30$ the convergence is complete. All other threshold estimates in Fig. 3 are included for context and use the same simulation parameters as described above for Fig. 2.

FIG. 11. Tensor-network decoder convergence for the 77×77 XZZX surface code with Z-biased noise, represented by the shifted logical failure rate $f_\chi - f_{24}$, as a function of $\chi$ at a physical error probability $p$ near the hashing bound for the given bias $\eta$. Each data point corresponds to 30 000 runs with identical errors generated across all $\chi$ for a given bias.
Appendix C: The minimum-weight perfect-matching decoder
Decoders based on the minimum-weight perfect-matching algorithm [37,38] are ubiquitous in the quantum error-correction literature [5,14,32,39,76]. The efficient algorithm returns a perfect matching of an input graph with weighted edges such that the sum of the weights of the edges of the matching is minimal.
We use this algorithm for decoding by assigning the defects of the syndrome to vertices of the input graph. We then assign weights to the edges of the graph such that the matching gives us information on how to locally correct the defects and recover the encoded state. The success of the decoder depends on how we choose to weight the edges of the input graph, which we now discuss.
It is convenient to define a new coordinate system that follows the symmetries of the code. Denote by $D_j$ the sets of faces $f$ aligned along a diagonal line such that $S = \prod_{f \in D_j} S_f$ is a symmetry of the code with respect to Pauli-Z errors, i.e., $S$ commutes with Pauli-Z errors. One such diagonal is shown in Fig. 1(c). Let also $D'_j$ be the diagonal sets of faces that respect symmetries introduced by Pauli-X errors.
Let us first consider the decoder at infinite bias. In this limit, we can decode the lattice as a series of one-dimensional matching problems along the diagonals $D_j$. Any error drawn from the set of Pauli-Z errors $E_Z$ must create an even number of defects along diagonals $D_j$. Indeed, $S = \prod_{f \in D_j} S_f$ is a symmetry with respect to $E_Z$ since operators $S$ commute with errors $E_Z$. In fact, this special case of matching along a one-dimensional line is equivalent to decoding the repetition code using a majority-vote rule. As an aside, it is worth mentioning that the parallelized decoding procedure we have described vastly improves the speed of decoding in this infinite-bias limit.
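As a minimal sketch of this one-dimensional decoding step, the Python snippet below treats a single diagonal as a repetition code on a ring. This is an assumption made for illustration only, since the boundary conditions of an actual diagonal depend on the lattice. The function name `decode_repetition_ring` and the syndrome convention (defect `i` sits between qubits `i` and `i + 1`, modulo the ring length) are hypothetical.

```python
def decode_repetition_ring(syndrome):
    """Minimum-weight correction for a repetition code on a ring.

    syndrome[i] is 1 when a defect sits between qubit i and qubit (i + 1) % n.
    Exactly two error patterns are consistent with a given syndrome; they are
    complements of each other, and picking the lighter one is a majority vote.
    """
    n = len(syndrome)
    if sum(syndrome) % 2 != 0:
        raise ValueError("defects on a ring must come in pairs")
    # Build one consistent pattern by fixing qubit 0 to 'no error'.
    candidate = [0] * n
    for i in range(1, n):
        candidate[i] = candidate[i - 1] ^ syndrome[i - 1]
    # Switch to the complementary pattern if it flips fewer qubits.
    if 2 * sum(candidate) > n:
        candidate = [bit ^ 1 for bit in candidate]
    return candidate


def ring_syndrome(error):
    """Syndrome produced by an error pattern under the convention above."""
    n = len(error)
    return [error[i] ^ error[(i + 1) % n] for i in range(n)]


error = [0, 1, 1, 0, 0, 0, 0]
correction = decode_repetition_ring(ring_syndrome(error))
print(correction == error)
```

Because each diagonal is decoded independently in this picture, the calls for different diagonals could be run in parallel.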
We next consider a finite-bias error model where qubits experience errors with probability $p$. Pauli-Z errors occur at a higher rate, $p_{\mathrm{h.r.}}$, and Pauli-X and Pauli-Y errors both occur at the same low rate $p_{\mathrm{l.r.}}$, as defined in the main text. At finite bias, string-like errors can now extend in all directions along the two-dimensional lattice. Again, we use minimum-weight perfect matching to find a correction by pairing nearby defects with the string operators that correspond to errors that are likely to have created the defect pair.
We decode by giving a complete graph to the minimum-weight perfect-matching algorithm, where each pair of defects $u$ and $v$ is connected by an edge of weight $\sim -\log \mathrm{prob}(E_{u,v})$, where $\mathrm{prob}(E_{u,v})$ is the probability that the most probable string $E_{u,v}$ created defects $u$ and $v$. It remains to evaluate $-\log \mathrm{prob}(E_{u,v})$.

(Figure caption fragment) ... Fig. 13. Gradient of the line of best fit to these data points is 0.503(4), in agreement with the expected gradient of 1/2.
For the uncorrelated noise models we consider, $-\log \mathrm{prob}(E_{u,v})$ depends, anisotropically, on the separation of $u$ and $v$. We define orthogonal axes $x$ ($y$) that align along (run orthogonal to) the diagonal line that follows the faces of $D_j$. We can then define the separation between $u$ and $v$ along axes $x$ and $y$ using the Manhattan distance with integers $l_x$ and $l_y$, respectively. On large lattices, then, we choose $-\log \mathrm{prob}(E_{u,v}) \propto w_{\mathrm{h.r.}} l_x + w_{\mathrm{l.r.}} l_y$, where $w_{\mathrm{l.r.}} = -\log[p_{\mathrm{l.r.}}/(1 - p)]$ and $w_{\mathrm{h.r.}} = -\log[p_{\mathrm{h.r.}}/(1 - p)]$.
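To illustrate how these anisotropic weights steer the pairing, the following Python sketch assigns each pair of defects the weight $w_{\mathrm{h.r.}} l_x + w_{\mathrm{l.r.}} l_y$ and finds the minimum-weight pairing by brute force. Several things here are assumptions for illustration: the brute-force search stands in for the efficient matching algorithm of Refs. [37,38] and is only practical for a handful of defects, the defects live on an unbounded plane with coordinates given along the $x$ and $y$ axes, both weights are taken to share the $1 - p$ denominator with $p = p_{\mathrm{h.r.}} + 2 p_{\mathrm{l.r.}}$, and the names `string_weight` and `min_weight_pairing` are hypothetical.

```python
import math


def string_weight(l_x, l_y, p_hr, p_lr):
    """Weight w_hr * l_x + w_lr * l_y for separations l_x and l_y."""
    p = p_hr + 2 * p_lr  # assumed: Z at rate p_hr, X and Y each at rate p_lr
    w_hr = -math.log(p_hr / (1 - p))
    w_lr = -math.log(p_lr / (1 - p))
    return w_hr * l_x + w_lr * l_y


def min_weight_pairing(defects, p_hr, p_lr):
    """Brute-force minimum-weight perfect matching over (x, y) defects.

    Returns (total_weight, list_of_pairs). An even number of defects is
    expected; otherwise the returned weight is infinite.
    """
    if not defects:
        return 0.0, []
    first, rest = defects[0], defects[1:]
    best_weight, best_pairs = math.inf, []
    for i, partner in enumerate(rest):
        weight = string_weight(abs(first[0] - partner[0]),
                               abs(first[1] - partner[1]), p_hr, p_lr)
        sub_weight, sub_pairs = min_weight_pairing(
            rest[:i] + rest[i + 1:], p_hr, p_lr)
        if weight + sub_weight < best_weight:
            best_weight = weight + sub_weight
            best_pairs = [(first, partner)] + sub_pairs
    return best_weight, best_pairs


defects = [(0, 0), (2, 0), (0, 1), (2, 1)]
_, biased_pairs = min_weight_pairing(defects, p_hr=0.05, p_lr=0.0005)
_, unbiased_pairs = min_weight_pairing(defects, p_hr=0.01, p_lr=0.01)
print(biased_pairs)
print(unbiased_pairs)
```

With strong bias, the defects are paired along the high-rate $x$ direction even though the partners along $y$ are geometrically closer; with equal rates the pairing reverts to the geometrically nearest partners.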
The edges returned from the minimum-weight perfect-matching algorithm [37,38] indicate which pairs of defects should be paired. We note that, for small, rectangular lattices with periodic boundary conditions, it may be that the most probable string $E_{u,v}$ is caused by a large number of high-rate errors that create a string that wraps around the torus. It is important that our decoder checks for such strings to achieve a logical failure rate scaling like $O(p_{\mathrm{h.r.}}^{d^2/2})$. We avoid recomputing the weight between two defects in every simulation by creating a look-up table from which the required weights can be efficiently retrieved. Moreover, we minimize memory usage by taking advantage of the translational invariance of the lattice.
We finally remark that our minimum-weight perfect-matching decoder naturally extends to the fault-tolerant regime by weighting edges connecting pairs of defects in the (2+1)-dimensional syndrome history such that $-\log \mathrm{prob}(E_{u,v}) \propto l_x w_{\mathrm{h.r.}} + l_y w_{\mathrm{l.r.}} + l_t w_t$, where $l_t$ is the separation of $u$ and $v$ along the time axis, $w_t = -\log[q/(1 - q)]$, and $q = p_{\mathrm{h.r.}} + p_{\mathrm{l.r.}}$. In the limit that $\eta = 1/2$, our decoder is equivalent to the conventional minimum-weight perfect-matching decoder for phenomenological noise [39].
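The space-time weight can be sketched in Python as follows. The relations $p_{\mathrm{h.r.}} = p\eta/(\eta + 1)$ and $p_{\mathrm{l.r.}} = p/[2(\eta + 1)]$ used in `rates_from_bias` are an assumption about the main-text definitions, chosen so that $p_{\mathrm{h.r.}} + 2p_{\mathrm{l.r.}} = p$ and $\eta = 1/2$ corresponds to equal rates for all three Pauli errors. The spatial weights are also assumed to share the $1 - p$ denominator, and the function names are hypothetical.

```python
import math


def rates_from_bias(p, eta):
    """Assumed main-text convention: eta = p_Z / (p_X + p_Y), p_X = p_Y."""
    p_hr = p * eta / (eta + 1)      # high-rate (Pauli-Z) probability
    p_lr = p / (2 * (eta + 1))      # low-rate (Pauli-X or Pauli-Y) probability
    return p_hr, p_lr


def spacetime_weight(l_x, l_y, l_t, p, eta):
    """Weight l_x * w_hr + l_y * w_lr + l_t * w_t for a pair of defects."""
    p_hr, p_lr = rates_from_bias(p, eta)
    q = p_hr + p_lr                 # measurement error probability from the text
    w_hr = -math.log(p_hr / (1 - p))
    w_lr = -math.log(p_lr / (1 - p))
    w_t = -math.log(q / (1 - q))
    return l_x * w_hr + l_y * w_lr + l_t * w_t


# At eta = 1/2 the two spatial directions carry the same weight.
print(spacetime_weight(1, 0, 0, p=0.01, eta=0.5))
print(spacetime_weight(0, 1, 0, p=0.01, eta=0.5))
```

Under these assumptions the spatial weights coincide at $\eta = 1/2$, so the weighting becomes isotropic in space, in line with the reduction to the conventional decoder noted above; at larger bias a step along $x$ is cheaper than a step along $y$.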
In Fig. 13 we plot the data shown in the main text in Fig. 5 (right) as a function of $d$ to read the gradient $G(p, \eta)$ from the graph. We then plot $G(p, \eta)$ as a function of $\beta = \log[p/(1 - p)]$ in the inset of Fig. 13. The plot reveals a gradient $\sim 0.5$, consistent with our ansatz, where we expect a gradient of 1/2. Furthermore, at $p = 0$ we define the restricted function

$I(\eta) \equiv G(p = 0, \eta) = \gamma \log 2 + \frac{1}{4} \log\left[\frac{\eta + 1/2}{(\eta + 1)^2}\right]$.   (D3)

We estimate $I(\eta)$ from the extrapolated $p = 0$ intercepts of our plots, such as shown in the inset of Fig. 13, and present these intercepts as a function of $\log[(\eta + 1/2)/(\eta + 1)^2]$; see Fig. 14. We find a line of best fit with gradient $0.22 \pm 0.03$, which agrees with the expected value of 1/4. Moreover, from the intercept of this fit, we estimate $\gamma = 1.8 \pm 0.06$, which is consistent with $3/2 \le \gamma \le 2$ that we expect [42]. Thus, our data are consistent with our ansatz that typical error configurations lead to logical failure with $\sim d/4$ low-rate errors.
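The extraction of the gradient and of $\gamma$ from Eq. (D3) can be sketched as an ordinary least-squares line fit. The Python snippet below is a self-consistency check only: it generates noiseless points from Eq. (D3) with an illustrative value of $\gamma$ and a hypothetical set of biases, rather than using the simulated data, and confirms that the fit returns a gradient of 1/4 and the input $\gamma$. The names `intercept_ansatz` and `line_fit` are introduced here for illustration.

```python
import math


def intercept_ansatz(eta, gamma):
    """I(eta) from Eq. (D3)."""
    return gamma * math.log(2) + 0.25 * math.log((eta + 0.5) / (eta + 1) ** 2)


def line_fit(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = s_xy / s_xx
    return slope, mean_y - slope * mean_x


GAMMA_DEMO = 1.8                      # illustrative input, not a fitted result
etas = [3, 10, 30, 100, 300]          # hypothetical biases for the demonstration
xs = [math.log((eta + 0.5) / (eta + 1) ** 2) for eta in etas]
ys = [intercept_ansatz(eta, GAMMA_DEMO) for eta in etas]

slope, intercept = line_fit(xs, ys)
gamma_est = intercept / math.log(2)   # the intercept of the fit equals gamma * log 2
print(slope, gamma_est)
```

With real data the points scatter about the line, and the uncertainty in the fitted gradient and intercept carries over to the quoted error bars.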