Communications in Theoretical Physics
Quantum Physics and Quantum Information

Neural network-based decoding for bias-tailored quantum codes over quantum channels with asymmetric noise

  • Jihao Fan 1,2,*
  • Qianhui Zhang 1
  • Zhihua Zhang 1
  • Jun Li 3
  • 1 School of Cyber Science and Engineering, Nanjing University of Science and Technology, Nanjing 210094, China
  • 2 Laboratory for Advanced Computing and Intelligence Engineering, Wuxi 214083, China
  • 3 School of Electronic and Optical Engineering, Nanjing University of Science and Technology, Nanjing 210094, China

Author to whom any correspondence should be addressed.

Received date: 2024-12-21

Revised date: 2025-06-01

Accepted date: 2025-06-03

Online published: 2025-08-04

Copyright

© 2025 Institute of Theoretical Physics CAS, Chinese Physical Society and IOP Publishing. All rights, including for text and data mining, AI training, and similar technologies, are reserved.
This article is available under the terms of the IOP-Standard License.

Abstract

To improve the decoding performance of quantum error-correcting codes over asymmetric noise channels, a neural network-based decoding algorithm for bias-tailored quantum codes is proposed. The algorithm consists of a biased noise model, a neural belief propagation decoder, a convolutional optimization layer, and a multi-objective loss function. The biased noise model simulates asymmetric error generation, providing a training dataset for decoding. The neural network, leveraging dynamic weight learning and a multi-objective loss function, mitigates error degeneracy. Additionally, the convolutional optimization layer improves early-stage convergence efficiency. Numerical results show that for bias-tailored quantum codes, our decoder performs much better than belief propagation with ordered statistics decoding (BP + OSD). Our decoder achieves an order of magnitude improvement in error suppression compared to higher-order BP + OSD. Furthermore, the decoding threshold of our decoder on surface codes reaches 20%.

Cite this article

Jihao Fan, Qianhui Zhang, Zhihua Zhang, Jun Li. Neural network-based decoding for bias-tailored quantum codes over quantum channels with asymmetric noise[J]. Communications in Theoretical Physics, 2025, 77(12): 125101. DOI: 10.1088/1572-9494/ade49c

1. Introduction

Currently, quantum systems promise faster computing, more secure communication, and more precise sensing [1–3]. However, all quantum systems face a fundamental problem: quantum bits (qubits) are highly prone to errors. Therefore, it is essential to incorporate quantum error-correcting codes (QECCs) into fault-tolerant quantum systems [4, 5]. QECCs typically assume that qubits are subject to depolarizing noise, where bit-flip (X), phase-flip (Z), and combined bit/phase-flip (Y) Pauli errors occur with equal probabilities. In reality, however, physical qubits are often affected by biased noise, where one type of error is more likely to occur than the others. For instance, phase noise predominates in superconducting qubit architectures [6, 7]. As a result, bias-tailored QECCs [8, 9] have been designed to exploit the asymmetry in the noise to improve error correction performance. Bonilla Ataides et al [8] demonstrated that a variant of the surface code, the XZZX code, performs significantly better under biased noise. In a recent study, Roffe et al [9] introduced a bias-tailored lifted product code structure, providing a framework for extending bias-tailoring beyond two-dimensional topological codes. They presented examples of bias-tailored lifted product codes based on classical quasi-cyclic codes and numerically evaluated their performance using belief propagation (BP) with ordered statistics decoding (BP + OSD).
Among decoding algorithms for sparse graph codes, the belief propagation algorithm offers low decoding overhead and excellent decoding performance [10, 11]. Quantum low-density parity-check (QLDPC) codes can also be decoded with BP algorithms. However, due to short cycles and quantum degeneracy [12, 13] in QLDPC codes, traditional BP often fails to decode successfully. In recent years, various improvements to BP have been proposed, such as BP with additional memory effects (MBP) [12], BP with posterior adjustment [14], BP + OSD [15, 16], minimum-weight perfect matching (MWPM) [17–20], and neural BP (NBP) [21–23]. However, most of these improvements are designed for symmetric noise. Machine learning (ML) [24–26] is now developing rapidly; although NBP is currently constrained by code length, the rapid advancement of GPU technology is expected to significantly improve its scalability and efficiency. We therefore chose to build a neural network decoder, a major trend in ML development [27–29], even though it has a complex and time-consuming training process and roughly linear growth in computational complexity [30] compared to traditional BP. Nevertheless, the performance of a trained decoder is often unsurpassed by other methods [31–33]. The main objective of this paper is to demonstrate the advantages of neural networks for bias-tailored quantum codes.
In this paper, we design a new neural network-based decoding algorithm, NBP + CNN + BIAS, for biased quantum codes to effectively handle quantum information under biased noise. The name reflects the core components of our approach: NBP as the primary decoding framework, convolutional neural network (CNN) to accelerate convergence, and bias-tailored optimization to adapt to asymmetric noise. This algorithm leverages the asymmetry of the noise to enhance the error correction performance of QECCs. The contributions of this paper are as follows:

A biased noise model is developed to generate asymmetric noise vectors, which serve as the input dataset for the neural network.

A neural network decoding algorithm based on the belief propagation framework is proposed. By utilizing a customized multi-objective loss function, the algorithm corrects errors and mitigates quantum error degeneracy during quantum information transmission. The decoding threshold on surface codes reaches 20%.

Several convolutional optimization schemes are discussed, which accelerate the convergence speed of the proposed algorithm.

Numerical results show that for bias-tailored quantum codes, NBP + CNN + BIAS performs better than BP + OSD. Our proposed NBP + CNN + BIAS decoder achieves an order of magnitude improvement in error suppression relative to higher-order BP + OSD.

2. Preliminaries

2.1. Calderbank–Shor–Steane codes

The Calderbank-Shor-Steane (CSS) code family is a special type of QECCs that has independent X-stabilizers and Z-stabilizers so that each stabilizer is independently composed of X-type Pauli operators and Z-type Pauli operators [34]. A CSS code can be constructed by the parity-check matrices HX and HZ corresponding to the two classical codes CX and CZ, which can be expressed in the following matrix form:
$\begin{eqnarray}{H}_{{\rm{CSS}}}=\left[\left.\begin{array}{c}0\\ {H}_{X}\end{array}\right|\begin{array}{c}{H}_{Z}\\ 0\end{array}\right],\end{eqnarray}$
where HCSS fulfills the symplectic criterion ${H}_{Z}\cdot {H}_{X}^{{\rm{T}}}=0$, where T denotes matrix transposition.

2.2. Bias-tailored XZZX codes

The hypergraph product code is a special method for constructing CSS codes that allows the construction of quantum codes from any two classical codes [35]. Let C1 = [n1, k1, d1] and C2 = [n2, k2, d2] be two linear codes whose parity-check matrices are given as H1 and H2, respectively. Then, the HX and HZ of the hypergraph product (HGP) code are given as ${H}_{X}=\left({H}_{1}\otimes {{\mathbb{1}}}_{{n}_{2}}| {{\mathbb{1}}}_{{m}_{1}}\otimes {{H}_{2}}^{T}\right)$ and ${H}_{Z}=\left({{\mathbb{1}}}_{{n}_{1}}\otimes {H}_{2}| {{H}_{1}}^{T}\otimes {{\mathbb{1}}}_{{m}_{2}}\right)$, respectively. Based on the preceding discussion, the structure of the HCSS matrix can be expressed as follows:
$\begin{eqnarray}{H}_{{\rm{CSS}}}=\left[\left.\begin{array}{c}0\\ {H}_{X}\end{array}\right|\begin{array}{c}{H}_{Z}\\ 0\end{array}\right]=\left[\left.\begin{array}{cc}0 & 0\\ {H}_{X1} & {H}_{X2}\end{array}\right|\begin{array}{cc}{H}_{Z1} & {H}_{Z2}\\ 0 & 0\end{array}\right],\end{eqnarray}$
where we denote by HX = [HX1, HX2] and HZ = [HZ1, HZ2], respectively. Corresponding to the HGP code structure, these submatrices are defined as follows: ${H}_{X1}={H}_{1}\otimes {{\mathbb{1}}}_{{n}_{2}}$, ${H}_{X2}={{\mathbb{1}}}_{{m}_{1}}\otimes {{H}_{2}}^{T}$, ${H}_{Z1}={{\mathbb{1}}}_{{n}_{1}}\otimes {H}_{2}$, and ${H}_{Z2}={{H}_{1}}^{T}\otimes {{\mathbb{1}}}_{{m}_{2}}$.
We apply the Hadamard rotation operation to the HCSS matrix [9], yielding the matrix HHR as follows:
$\begin{eqnarray}{H}_{{\rm{HR}}}=\left[\left.\begin{array}{cc}0 & {H}_{Z2}\\ {H}_{X1} & 0\end{array}\right|\begin{array}{cc}{H}_{Z1} & 0\\ 0 & {H}_{X2}\end{array}\right].\end{eqnarray}$
At this stage, we have obtained the parity-check matrix HHR for the bias-tailored XZZX code [9]. Here, we take the HGP code as an example. In fact, as long as the parity-check matrix of a quantum code can be decomposed into four submatrices, the Hadamard rotation transformation can be applied to construct a bias-tailored code. Then, we refer to equation (2) as the parity-check matrix of a QECC in the standard CSS form and equation (3) as the parity-check matrix of a bias-tailored QECC in the XZZX form.
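As an illustration of the construction above (a minimal sketch, not the authors' code; the 3-bit repetition-code check matrix used for H1 and H2 is a hypothetical stand-in), the HGP submatrices and the Hadamard-rotated matrix of equation (3) can be assembled with Kronecker products:

```python
import numpy as np

# Hypothetical classical parity-check matrix ([3, 1] repetition code).
H1 = np.array([[1, 1, 0],
               [0, 1, 1]])
H2 = H1.copy()
m1, n1 = H1.shape
m2, n2 = H2.shape

# Hypergraph-product submatrices from section 2.2.
HX1 = np.kron(H1, np.eye(n2, dtype=int))
HX2 = np.kron(np.eye(m1, dtype=int), H2.T)
HZ1 = np.kron(np.eye(n1, dtype=int), H2)
HZ2 = np.kron(H1.T, np.eye(m2, dtype=int))
HX = np.hstack([HX1, HX2])
HZ = np.hstack([HZ1, HZ2])

# CSS symplectic criterion: HZ · HX^T = 0 (mod 2).
assert not np.any((HZ @ HX.T) % 2)

# Hadamard rotation, equation (3): the second column blocks swap roles.
Z = lambda r, c: np.zeros((r, c), dtype=int)
H_HR = np.block([
    [Z(HZ1.shape[0], HX1.shape[1]), HZ2, HZ1, Z(HZ1.shape[0], HX2.shape[1])],
    [HX1, Z(HX1.shape[0], HZ2.shape[1]), Z(HX1.shape[0], HZ1.shape[1]), HX2],
])
```

Any pair of classical check matrices can be substituted for H1 and H2; the block layout of H_HR follows equation (3) directly.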
It is worth noting that the construction of equations (2) and (3) improves decoding only for specific constructions [9, 36], the HGP construction being one example. After Hadamard rotation, the symmetric structure of the HGP code is broken, and the HX matrix depends entirely on the H1 matrix. In this case, if H1 has better parameters than HCSS, the improvement in decoding performance under a bias favoring HX is evident. We discuss this further in the experimental section.

3. Neural belief propagation decoder

According to figure 1, the specific process of the decoder is as follows:
Figure 1. The structure of NBP + BIAS decoder. The decoder mainly consists of a biased noise model and a neural belief propagation (NBP) decoder using a multi-objective loss function.

The quantum channel generates simulated asymmetric noise, which is received by the quantum biased noise model. This model produces the rotated asymmetric noise errors X and Z required by the decoder and applies them to the HZ and HX matrices (see equations (4) and (5)). This step generates the syndrome s.

The input information to NBP consists of equation (7) and the syndrome s obtained from the aforementioned steps. During the BP process, the neural network incorporates weight parameters wi to soften the edge weights, which are adjusted through a multi-objective loss function. The output of the neural network is the marginalized probability information given by equation (10). This step generates the marginal probability λv.

The marginalized probability λv is used to infer errors through the equation

$\begin{eqnarray*}\hat{{\boldsymbol{e}}}=\,\rm{sigmoid}\,({\lambda }_{v})=\frac{1}{1+{{\rm{e}}}^{-{\lambda }_{v}}},\end{eqnarray*}$
where e is the natural constant. If a value falls within the range [0, 0.5), the BP hard decision assigns $\widehat{{\boldsymbol{e}}}=0$; if it falls within [0.5, 1], the BP hard decision assigns $\widehat{{\boldsymbol{e}}}=1$.

Verify whether the result $\widehat{{\boldsymbol{e}}}$ satisfies $({\boldsymbol{e}}+\widehat{{\boldsymbol{e}}})\cdot {{\boldsymbol{H}}}_{i}^{\perp }=0$. If the condition is not met, a logical error has occurred.
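The hard-decision rule in the steps above can be sketched as follows (assuming the marginal LLRs λv are already available):

```python
import numpy as np

def hard_decision(lam_v):
    """Map marginal LLRs to a binary error estimate via the sigmoid rule:
    values in [0, 0.5) give e_hat = 0, values in [0.5, 1] give e_hat = 1."""
    p = 1.0 / (1.0 + np.exp(-np.asarray(lam_v, dtype=float)))  # sigmoid
    return (p >= 0.5).astype(int)

print(hard_decision([-2.0, 0.0, 3.5]))  # -> [0 1 1]
```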

Additionally, the convolution operation mentioned in section 4 can accelerate the convergence of the NBP algorithm. Specifically, this is achieved by incorporating convolutional layers into the BP process, as illustrated in figure 2.

Figure 2. Convolutional optimization schemes. The symbol ‘⋯' represents a repeating structure and ‘T' is the number of repetitions.

3.1. Biased noise model

The characteristics of quantum codes require that their decoding methods depend on the syndrome [37, 38]. First, the syndrome s of the XZZX code is calculated by applying the Hadamard-rotated stabilizers to the error vector e. Let HR(x) denote the application of the Hadamard-rotated matrix of equation (3) to a vector x, i.e.
$\begin{eqnarray}\begin{array}{rcl}\left({{\boldsymbol{s}}}_{X}| {{\boldsymbol{s}}}_{Z}\right) & = & \,\rm{HR}\,({\boldsymbol{e}})={H}_{{\rm{HR}}}\cdot \left({{\boldsymbol{e}}}_{X1}{{\boldsymbol{e}}}_{X2}| {{\boldsymbol{e}}}_{Z1}{{\boldsymbol{e}}}_{Z2}\right)\\ & = & \left({H}_{Z1}\cdot {{\boldsymbol{e}}}_{X1}+{H}_{Z2}\cdot {{\boldsymbol{e}}}_{Z2}| {H}_{X1}\cdot {{\boldsymbol{e}}}_{Z1}+{H}_{X2}\cdot {{\boldsymbol{e}}}_{X2}\right).\end{array}\end{eqnarray}$
Then, equation (4) decouples into two problems,
$\begin{eqnarray}\begin{array}{rcl}{{\boldsymbol{s}}}_{X} & = & {H}_{Z}\cdot \left({{\boldsymbol{e}}}_{X1}| {{\boldsymbol{e}}}_{Z2}\right),\\ {{\boldsymbol{s}}}_{Z} & = & {H}_{X}\cdot \left({{\boldsymbol{e}}}_{Z1}| {{\boldsymbol{e}}}_{X2}\right).\end{array}\end{eqnarray}$

Algorithm 1. Biased noise model

Input: CSS parity-check matrices HX and HZ; code length N; column length N1 of HX1; Pauli error rates εX, εY, εZ
Output: errors eX and eZ
1:  eX = 0, eZ = 0
2:  pX = 0, pZ = 0
3:  ▷ Hadamard rotation
4:  for i = 1 to N1 do
5:      pX[i] = εX + εY
6:      pZ[i] = εZ + εY
7:  end for
8:  for i = N1 + 1 to N do
9:      pX[i] = εZ + εY
10:     pZ[i] = εX + εY
11: end for
12: ▷ generate random errors
13: for i = 1 to N do
14:     ξ = random(0, 1)
15:     if ξ < pX[i] then
16:         eX[i] = 1, eZ[i] = 0
17:     else if ξ < pX[i] + pZ[i] then
18:         eX[i] = 0, eZ[i] = 1
19:     else if ξ < pX[i] + pZ[i] + εY then
20:         eX[i] = 1, eZ[i] = 1
21:     end if
22: end for
23: return eX, eZ

In equations (4) and (5), the HX and HZ matrices are the parity-check matrices of the standard CSS quantum code before the Hadamard rotation operation. This step demonstrates that, for simulation purposes, it is sufficient to simply apply the Hadamard rotation operation to the error vector while retaining the original CSS form of the stabilizer matrices. This operation is illustrated in the yellow box in figure 1.
We initialize the error distribution by considering an error model in which each qubit is independently affected by Pauli noise. If εX, εY, and εZ are the physical error rates for X, Y, and Z errors, respectively, then the total physical error rate is ε = εX + εY + εZ. Under this condition, the error bias is ${B}_{i}=\frac{{\epsilon }_{i}}{{\sum }_{j\ne i}{\epsilon }_{j}}$, where εi, εj ∈ {εX, εY, εZ}. In general, the symmetric channel has BX = 0.5, i.e. εX = εY = εZ. The specific details of the noise model corresponding to this section can be found in algorithm 1.
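Algorithm 1 can be written in Python as follows (a re-implementation sketch; the sampling thresholds mirror the pseudocode, while the function name and the `seed` argument are our own additions):

```python
import numpy as np

def biased_noise(N, N1, eps_x, eps_y, eps_z, seed=None):
    """Sample rotated asymmetric errors (eX, eZ) following algorithm 1.

    Qubits i < N1 use flip probabilities (eps_x + eps_y, eps_z + eps_y);
    for the remaining qubits the Hadamard rotation swaps the two roles.
    """
    rng = np.random.default_rng(seed)
    eX = np.zeros(N, dtype=int)
    eZ = np.zeros(N, dtype=int)
    rotated = np.arange(N) >= N1
    pX = np.where(rotated, eps_z + eps_y, eps_x + eps_y)
    pZ = np.where(rotated, eps_x + eps_y, eps_z + eps_y)
    xi = rng.random(N)
    for i in range(N):
        if xi[i] < pX[i]:                       # X error
            eX[i] = 1
        elif xi[i] < pX[i] + pZ[i]:             # Z error
            eZ[i] = 1
        elif xi[i] < pX[i] + pZ[i] + eps_y:     # Y error
            eX[i] = eZ[i] = 1
    return eX, eZ

eX, eZ = biased_noise(N=1000, N1=500,
                      eps_x=0.05, eps_y=0.005, eps_z=0.005, seed=0)
```

A batch of such (eX, eZ) samples, together with their syndromes, forms the training dataset described above.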

3.2. Neural belief propagation

Neural networks soften the messages exchanged during decoding, effectively avoiding the quantum degeneracy problem often encountered in quantum decoding. For highly degenerate quantum codes, many low-weight stabilizers correspond to local minima in the energy landscape, so traditional BP is easily trapped in local minima near the origin. Weight updates in the neural network enhance the network's perception ability by adjusting the step size, allowing the energy minimization process to converge. Belief propagation is an iterative algorithm used to approximate the marginal of each variable node ev.
Note that the weight matrix follows the extrinsic-information rule of BP: each weight column is built from a row of the check matrix with the influence of its own element removed.
For example, for the matrix
$\begin{eqnarray}\left[\begin{array}{ccc}1 & 1 & 0\\ 0 & 1 & 1\end{array}\right].\end{eqnarray}$
Construct the weight matrix w:

At position (0,0), the matrix entry is 1. We duplicate the row [1,1,0], set the 0th element to 0, yielding [0,1,0], and store it as the 0th column of w.

At position (0,1), the entry is 1. We duplicate the row [1,1,0], set the 1st element to 0, yielding [1,0,0], and store it as the 1st column of w.

At position (0,2), the entry is 0. This entry is skipped.

At position (1,0), the entry is 0. This entry is skipped.

At position (1,1), the entry is 1. We duplicate the row [0,1,1], set the 1st element to 0, yielding [0,0,1], and store it as the 2nd column of w.

At position (1,2), the entry is 1. We duplicate the row [0,1,1], set the 2nd element to 0, yielding [0,1,0], and store it as the 3rd column of w.
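The six steps above amount to a simple rule: one column of w per nonzero entry of the check matrix, taken in row-major order, equal to that entry's row with its own position zeroed. A sketch:

```python
import numpy as np

def build_weight_matrix(H):
    """One column of w per nonzero entry of H, scanned row by row:
    row r of H with entry (r, c) zeroed (the extrinsic-message pattern)."""
    cols = []
    for r, c in zip(*np.nonzero(H)):
        col = H[r].copy()
        col[c] = 0
        cols.append(col)
    return np.stack(cols, axis=1)

H = np.array([[1, 1, 0],
              [0, 1, 1]])
w = build_weight_matrix(H)
# Columns of w, in order: [0,1,0], [1,0,0], [0,0,1], [0,1,0],
# matching the worked example above.
```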

Then, steps 1–4 below provide a structured overview of the process for enhancing the BP algorithm with a neural network. The original BP algorithm is extended by incorporating additional trainable weights wi, enabling adaptive optimization. The neural network updates these weights through a loss function, adjusting their values after each complete training cycle to improve decoding performance.
Step 1: Initialize the variable nodes
$\begin{eqnarray}{\lambda }_{v}^{(1)}={w}_{in}{l}_{v}={w}_{in}\,{\mathrm{log}}\,\frac{P({e}_{v}=0)}{P({e}_{v}=1)}={w}_{in}\,{\mathrm{ln}}\,\frac{1-\epsilon }{\epsilon },\end{eqnarray}$
where ε represents the estimated physical error probability of the channel, lv denotes the prior log-likelihood ratio (LLR) information of the variable ev, and win is the weight of the initialized input information.
Step 2: Obtain the messages passed from the variable nodes to the check nodes
$\begin{eqnarray}{\lambda }_{v\to c}^{(t+1)}=\tanh \left(\displaystyle \frac{1}{2}\left({w}_{lv}^{(t)}{l}_{v}+\displaystyle \sum _{c^{\prime} \in { \mathcal N }(v)\backslash c}{w}_{cv}^{(t)}{\lambda }_{c^{\prime} \to v}^{(t)}\right)\right),\end{eqnarray}$
where ${\lambda }_{c\to v}^{(t)}$ represents the message passed from the check node to the variable node in the t-th iteration of the BP process, and ${ \mathcal N }(x)\backslash y$ denotes the set of all neighboring nodes of x except y. Additionally, ${w}_{lv}^{(t)}$ denotes the weight of the LLR information in the t-th iteration, while ${w}_{cv}^{(t)}$ represents the weight of the message passed from the check node to the variable node in the t-th iteration of BP.
Step 3: Obtain the messages passed from the check nodes to the variable nodes
$\begin{eqnarray}{\lambda }_{c\to v}^{(t+1)}={(-1)}^{{s}_{c}}2{\tanh }^{-1}\displaystyle \prod _{v^{\prime} \in { \mathcal N }(c)\backslash v}{w}_{vc}^{(t)}{\lambda }_{v^{\prime} \to c}^{(t)},\end{eqnarray}$
where sc is the error syndrome passed to the check node, and its calculation is given in equations (4) and (5). Additionally, ${w}_{vc}^{(t)}$ represents the weight of the message passed from the variable node to the check node in the t-th iteration of the BP process.
Steps 2 and 3 are iteratively performed until the number of iterations reaches T.
Step 4: Marginalization
$\begin{eqnarray}{\lambda }_{v}={w}_{lv}^{(\,T\,)}\left({l}_{v}\right)+\displaystyle \sum _{c\in { \mathcal N }(v)}{w}_{\mathrm{out}}{\lambda }_{c\to v}^{(\,T\,)},\end{eqnarray}$
where T represents the maximum number of BP iterations set, and wout represents the weight of the output layer.
Note that we intentionally separate $\tanh $ and ${\tanh }^{-1}$ into the variable layer and check layer, respectively, to avoid the gradient vanishing problem during the neural network's information transmission process. This separation ensures smoother message passing and facilitates the insertion of convolutional layers in subsequent convolutional optimization steps.
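Steps 1–4 can be condensed into a plain-numpy message-passing loop. All trainable weights (win, wlv, wcv, wvc, wout) are fixed to 1 here, and we use the standard LLR sign convention (a negative marginal favors an error), so this is only a structural sketch of the iteration, not the trained decoder:

```python
import numpy as np

def nbp_decode(H, syndrome, eps, T=10):
    """Weighted-BP skeleton of steps 1-4 with all weights fixed to 1."""
    H = np.asarray(H)
    m, n = H.shape
    lv = np.log((1 - eps) / eps)          # step 1: prior LLR (w_in = 1)
    lam_cv = np.zeros((m, n))             # check-to-variable messages
    for _ in range(T):
        # Step 2: variable-to-check messages, extrinsic w.r.t. check c.
        mu_vc = np.zeros((m, n))
        for c in range(m):
            for v in range(n):
                if H[c, v]:
                    others = sum(lam_cv[c2, v] for c2 in range(m)
                                 if H[c2, v] and c2 != c)
                    mu_vc[c, v] = np.tanh(0.5 * (lv + others))
        # Step 3: check-to-variable messages with the syndrome sign.
        new = np.zeros((m, n))
        for c in range(m):
            for v in range(n):
                if H[c, v]:
                    prod = np.prod([mu_vc[c, v2] for v2 in range(n)
                                    if H[c, v2] and v2 != v])
                    new[c, v] = ((-1) ** syndrome[c]
                                 * 2 * np.arctanh(np.clip(prod, -0.999, 0.999)))
        lam_cv = new
    # Step 4: marginalization (w_out = 1).
    return lv + lam_cv.sum(axis=0)

lam = nbp_decode([[1, 1, 0], [0, 1, 1]], syndrome=[1, 0], eps=0.1, T=10)
# lam[0] < 0: bit 0 is flagged as the likely flipped bit.
```

In the full decoder, each weight in steps 1–4 becomes a trainable tensor and the loop is unrolled into network layers.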

3.3. Multi-objective loss function

There are three types of decoding failures in quantum error correction [12, 22]. Denote the parameters of a QECC by [[N, K, D]]. We mainly consider two types of errors. The first type is a classical error, which occurs when the inferred error $\hat{e}$ is inconsistent with the actual error e, i.e. $\hat{e}\ne e$. The second type is a quantum logical error, defined by the condition that the inferred error $\hat{e}$ does not satisfy $({\boldsymbol{e}}+\widehat{{\boldsymbol{e}}})\cdot {{\boldsymbol{H}}}_{i}^{\perp }=0$, where $HM{({H}^{\perp })}^{{\rm{T}}}=0\,\mathrm{mod}\,2$, $M=\left[\begin{array}{cc}0 & {I}_{N}\\ {I}_{N} & 0\end{array}\right]$, and H⊥ is referred to as the symplectic dual of H with respect to the symplectic inner product defined by M. Note that H includes both HX and HZ, so H is a 2(N − K) × 2N binary matrix. In quantum error correction, an inferred error that fails the classical condition may still satisfy the logical condition, i.e. cause no logical error; this is a manifestation of error degeneracy.
Traditional BP is limited by the difficulty of accurately locating the target error from a given syndrome, and instead reduces the error rate by utilizing a second error type [12]. Neural networks, on the other hand, have an advantage in solving the first type of error by softening the information in the transmission process and avoiding falling into local minima. Due to the unique error patterns of quantum codes, relying solely on the cross-entropy loss from traditional neural networks is insufficient. Unlike the NBP method in [21], the NBP + BIAS employs a multi-objective strategy that combines cross-entropy loss with logical error loss. The multi-objective loss function is $L=a{{ \mathcal L }}_{c}({\boldsymbol{\lambda }};{\boldsymbol{e}})+(1-a){{ \mathcal L }}_{l}({\boldsymbol{\lambda }};{\boldsymbol{e}})$, where the hyperparameter a is used to control the weight of the two losses. For the classical errors, we have
$\begin{eqnarray}{{ \mathcal L }}_{c}({\boldsymbol{\lambda }};{\boldsymbol{e}})=-\frac{1}{N}\displaystyle \sum _{i=1}^{N}{e}_{i}\,{\mathrm{log}}\,{\hat{e}}_{i}+\left(1-{e}_{i}\right)\,{\mathrm{log}}\,\left(1-{\hat{e}}_{i}\right).\end{eqnarray}$
For the logical errors, we have
$\begin{eqnarray}{{ \mathcal L }}_{l}({\boldsymbol{\lambda }};{\boldsymbol{e}})=\frac{1}{K}\displaystyle \sum _{j=1}^{K}\displaystyle \sum _{i=1}^{N}\left({\mathrm{OPT}}_{ji}f\left({e}_{i}+{\hat{e}}_{i}\right)\right).\end{eqnarray}$
The function $f(x)=\left|\sin \frac{\pi x}{2}\right|$ converts the discrete integer mod 2 operation into a continuous one. Here $\hat{e}=\,\rm{sigmoid}\,({\lambda }_{v})=\frac{1}{1+{{\rm{e}}}^{-{\lambda }_{v}}}$, where e is the natural constant, and λv is computed by equation (10). OPT denotes the logical-operator matrix, typically computed in CSS form for the logical operator LZ: LZ must not lie in the row space of HZ and must lie in the null space of HX, and similarly for LX. Additionally, LX and LZ must anticommute.
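Under the definitions above, the combined loss can be sketched as follows (an illustrative numpy version; the single-row OPT matrix below is a hypothetical logical operator, and the `tiny` constant is our own numerical-safety addition):

```python
import numpy as np

def multi_objective_loss(lam_v, e, OPT, a=0.3):
    """L = a*L_c + (1 - a)*L_l, equations (11) and (12) (sketch)."""
    lam_v, e = np.asarray(lam_v, float), np.asarray(e, float)
    e_hat = 1.0 / (1.0 + np.exp(-lam_v))          # soft error estimate
    tiny = 1e-12                                  # numerical safety
    # Equation (11): binary cross-entropy over the N physical qubits.
    L_c = -np.mean(e * np.log(e_hat + tiny)
                   + (1 - e) * np.log(1 - e_hat + tiny))
    # Equation (12): continuous mod-2 via f(x) = |sin(pi*x/2)|,
    # projected onto the logical operators and averaged over K.
    f = np.abs(np.sin(np.pi * (e + e_hat) / 2.0))
    L_l = np.mean(np.asarray(OPT, float) @ f)
    return a * L_c + (1 - a) * L_l

e = np.array([0, 1, 0, 0])
OPT = np.array([[1, 1, 1, 1]])   # hypothetical logical-operator row
good = multi_objective_loss([-9.0, 9.0, -9.0, -9.0], e, OPT)
bad = multi_objective_loss([9.0, -9.0, 9.0, 9.0], e, OPT)
# The loss is near zero for an accurate estimate and larger otherwise.
```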

4. Convolutional optimization schemes

CNNs have demonstrated successful applications in low-level tasks such as image denoising and super-resolution, where the primary objective is to recover the original image pixels from noisy observations. This principle shares similarities with channel decoding, inspiring this study to explore whether this capability can be transferred to the decoding task of QECCs. In this work, we investigate the role of convolutional layers in NBP through extensive experiments. These layers assist in decoding by extracting local features, particularly during the early stages, helping the model identify local error patterns. Combined with the information update rules in recursive NBP, convolutional layers can smooth and enhance the update of local confidence values, thereby accelerating convergence and reducing the bit error rate. Several proposed schemes are shown in figure 2. In these schemes, each convolutional step consists of two layers: the first employs a 3 × 3 convolutional kernel with 16 filters, followed by a 1 × 1 convolutional kernel with 8 filters. Both layers utilize the ReLU activation function. The structure of this convolutional layer is illustrated in figure 3.
Figure 3. Structure of convolutional step.
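The convolutional step of figure 3 can be sketched in plain numpy (shapes follow the text: a 3 × 3 kernel with 16 filters, then a 1 × 1 kernel with 8 filters, both with ReLU; the actual decoder uses TensorFlow layers, so this loop version is only illustrative):

```python
import numpy as np

def conv_step(x, k3, b3, k1, b1):
    """One convolutional step from figure 3:
    3x3 conv (16 filters) -> ReLU -> 1x1 conv (8 filters) -> ReLU.
    Shapes: x (H, W, C), k3 (3, 3, C, 16), k1 (1, 1, 16, 8)."""
    H, W, C = x.shape
    pad = np.pad(x, ((1, 1), (1, 1), (0, 0)))      # 'same' padding
    y = np.zeros((H, W, 16))
    for i in range(H):
        for j in range(W):
            patch = pad[i:i + 3, j:j + 3, :]       # 3x3 receptive field
            y[i, j] = np.tensordot(patch, k3, axes=3) + b3
    y = np.maximum(y, 0.0)                          # ReLU
    z = np.tensordot(y, k1[0, 0], axes=([2], [0])) + b1  # 1x1 conv
    return np.maximum(z, 0.0)                       # ReLU

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 4, 1))                  # toy message map
k3 = rng.standard_normal((3, 3, 1, 16)) * 0.1
k1 = rng.standard_normal((1, 1, 16, 8)) * 0.1
out = conv_step(x, k3, np.zeros(16), k1, np.zeros(8))
```

Inserted between BP layers, such a step smooths the local confidence values before the next message-passing iteration.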

5. Numerical results

The dataset for neural network training is randomly generated by an asymmetric noise channel, as shown in algorithm 1. Note that for numerical results without specified Bi values, the default is Bi = 0.5.

5.1. Experimental environment and parameter definitions

For convenience, the decoder proposed in this paper is denoted as NBP + BIAS, and its convolutional optimization scheme is referred to as NBP + CNN + BIAS. The experiments in this paper were conducted on an NVIDIA RTX 3060 Laptop GPU. The NBP + CNN + BIAS decoder is implemented in the TensorFlow framework, using the Adam optimizer to minimize the loss function. The experimental setup is shown in table 1.
Table 1. Experimental setup.
Parameter | Value
learning rate | 0.0001
number of hidden layers | 20
number of BP iterations | 10
total number of batches | 40000
number of batches per epoch | 1000
physical error rate ε | 0.01 ∼ 0.26
batch size for each physical error rate | 30
error bias Bi | {0.5, 5, 10, 30, 50}
convolutional layer parameters | {3 × 3, 16; 1 × 1, 8}
batch size for validation set | 50
In a multi-objective loss function, the choice of hyperparameters is a complex issue. Table 2 justifies the value selected in this paper; accordingly, we choose the hyperparameter a = 0.3.
Table 2. Classical and logical error rates for the bias-tailored quantum code [[12, 2, 3]] under different hyperparameters a. The physical error rate ε is 0.06.
a | classical error rate | logical error rate
0.1 | 0.020292 | 0.016583
0.2 | 0.018125 | 0.014854
0.3 | 0.015646 | 0.011083
0.4 | 0.015875 | 0.012583
0.5 | 0.015854 | 0.013354
0.8 | 0.018937 | 0.015229
1.0 | 0.020958 | 0.020667

5.2. Decoder performance optimization

As an example, consider scheme 1 in figure 2, which is applied to the quantum code [[58,16,3]]. This code is constructed via the hypergraph product of the [7,4,3] code. We examine the spatial structure of the convolutional model in figure 4 and the weight distribution histogram in figure 5. This convolutional model, composed of 3 × 3 and 1 × 1 convolutional kernels, demonstrates significant advantages in the NBP + CNN + BIAS decoding of QECCs.
Figure 4. Visualization of the spatial structure of convolutional kernels. All the data results are obtained with a physical error rate of ε = 0.1.
Figure 5. Histogram of the weight distribution of convolutional kernels. All the data results are obtained with a physical error rate of ε = 0.1.
As shown in figure 4, the 3 × 3 convolutional kernel consists of 16 filters with diverse weight distributions, effectively capturing local spatial information such as local error patterns and noise structures in QECCs. This enhances the ability to locate and correct errors. The 1 × 1 convolutional kernel, with 8 filters, performs feature fusion and optimization across channels, integrating local features and highlighting critical information, which further improves the accuracy and efficiency of message passing.
From the weight distribution in figure 5, the weights of the 3 × 3 convolutional kernel are concentrated and smooth, reflecting its stability and robustness in BP decoding, effectively suppressing gradient instabilities caused by noise. In contrast, the 1 × 1 convolutional kernel exhibits a wider weight distribution, indicating stronger capabilities in global information representation and adjustment, ensuring the efficiency and accuracy of iterative updates.
Then, we analyze four partially iterative convolutional schemes and three fully iterative convolutional schemes, with their framework details illustrated in figure 2. According to figure 6, the optimization process of iterative scheme 1 is relatively slow, as its logical error rate decreases only slightly during the first 20 epochs and aligns with other schemes only after the 20th epoch, indicating lower optimization efficiency. In contrast, schemes 1 and 2 exhibit performance comparable to the NBP + CNN + BIAS method, achieving rapid convergence within 5–10 epochs and stabilizing thereafter, demonstrating similar optimization speed and effectiveness. Notably, schemes 3 and 4, along with iterative scheme 3, perform the best. These schemes significantly reduce the logical error rate during the early training stages (1–2 epochs) and converge quickly, showcasing higher optimization efficiency and faster convergence speed, achieving ideal performance in a shorter time. In scenarios with strict time requirements, we recommend selecting scheme 3, as it has relatively low resource consumption.
Figure 6. Comparison of convergence speeds for different convolutional schemes on [[58,16,3]]. All the data results are obtained with a physical error rate of ε = 0.1.
At the end of this section, we consider the threshold of QECCs. The threshold is a key metric for evaluating error correction capability in quantum computing, defined as the maximum physical error rate εth that a system can tolerate under a specific noise model. If the physical error rate ε satisfies ε < εth, the logical error rate decreases exponentially with the depth of the error correction cascade. Conversely, when ε ≥ εth, error correction may fail to suppress errors effectively, potentially leading to the accumulation of logical errors. The threshold is a necessary condition for fault-tolerant quantum computation, and its value directly determines the noise tolerance of a quantum computing system. The decoding thresholds of different decoders were tested on surface codes with d = 3, 5, 7 to demonstrate the decoding advantages of NBP + CNN + BIAS; see table 3 and figure 7.
Table 3. Observed thresholds for numerical simulations of the decoders applied to Surface codes (d = 3, 5, 7). ‘*' marks circuit-level noise.
Code | Decoder | Threshold
Surface | BP + OSD [15, 39] | ∼14%
Surface | AMBP [12] | ∼16%
Surface | PyMatching 2.0* [18] | ∼1%
Surface | MWPM [19] | ∼15.5%
Surface | BP-MWPM [20] | ∼17.76%
Surface | Astra [39] | ∼17%
Surface | NBP + CNN + BIAS | ∼20%
Figure 7. Decoding Surface codes with NBP + CNN + BIAS.
Note that a quaternary neural network combined with overcomplete check matrices was proposed to decode QECCs in [22]. As shown in figure 8, the decoder of [22] is labeled NBP4, where larger suffix numbers indicate more rows in the overcomplete matrices. The comparison shows that their decoder improves performance by adding redundant check rows, whereas our decoder achieves comparable performance while preserving the original check matrices.
Figure 8. Performance comparison of two decoders on [[46,2,9]].

5.3. Decoding performance on biased noise channels

Figure 9(a) shows the decoding curves for the CSS code and the XZZX code under different X-biases with a fixed physical error rate of ε = 0.06. The [[12,2,3]] code provided in [9] is tested here in both the standard CSS form and the XZZX form. Compared to the BP + OSD in [9], our NBP + CNN + BIAS achieves an order-of-magnitude improvement in decoding performance. Moreover, under the symmetric noise condition BX = 0.5, the logical error rates of the two codes are equal. However, as the bias increases, the decoding performance of the two codes diverges. For the CSS code, the logical error rate rises and eventually converges to an upper bound. In contrast, increasing the bias of the XZZX code results in an exponential decrease in the logical error rate. This can be attributed to the fact that the XZZX code decouples into its submatrices under high-bias conditions. The code distance of this quantum code is 3, while the distance dX under infinite X-bias is 6. Thus, the improvement in code performance with increasing bias is as expected.
Figure 9. Standard CSS quantum codes versus bias-tailored XZZX quantum codes. All legends with the suffix ‘BIAS' represent the decoding of the bias-tailored codes using the biased noise model. All the data results are obtained with a physical error rate of ε = 0.06. The OSD orders of [[12,2,3]] and [[129,28,3]] are 7 and 30, respectively.
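The biased noise channel underlying these simulations can be sketched as follows. This is a minimal illustration, assuming the common parametrization in which the X-bias is BX = pX/(pY + pZ) with pY = pZ, so that BX = 0.5 recovers the depolarizing channel; the function names are illustrative and not taken from the paper.

```python
import numpy as np

def biased_pauli_probs(eps, bias_x):
    """Split a physical error rate eps into (p_x, p_y, p_z) for an
    X-biased Pauli channel with B_X = p_x / (p_y + p_z) and p_y = p_z.
    B_X = 0.5 recovers the depolarizing channel p_x = p_y = p_z = eps/3."""
    p_x = eps * bias_x / (1.0 + bias_x)
    p_y = p_z = eps / (2.0 * (1.0 + bias_x))
    return p_x, p_y, p_z

def sample_biased_errors(n_qubits, eps, bias_x, rng=None):
    """Sample one Pauli error pattern as a length-2n binary vector
    (X part | Z part) in symplectic form."""
    if rng is None:
        rng = np.random.default_rng()
    p_x, p_y, p_z = biased_pauli_probs(eps, bias_x)
    # 0 = I, 1 = X, 2 = Y, 3 = Z on each qubit independently
    paulis = rng.choice(4, size=n_qubits, p=[1 - eps, p_x, p_y, p_z])
    e_x = ((paulis == 1) | (paulis == 2)).astype(np.uint8)  # X component
    e_z = ((paulis == 3) | (paulis == 2)).astype(np.uint8)  # Z component
    return np.concatenate([e_x, e_z])
```

For example, with ε = 0.06 and BX = 100, almost all of the error budget goes to X errors (pX ≈ 0.0594), which is the regime where the XZZX form pays off.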
The standard CSS quantum code [[129,28,3]] can be constructed via the hypergraph product of two classical BCH codes, [7,4,3] and [15,7,5]. Following Algorithm 1, we simulate its bias-tailored XZZX form [[129,28,3]]. The code distance is 3, but as the X-bias increases, the error-correction performance relies increasingly on the subcode [15,7,5]; in other words, under infinite X-bias the effective distance dX of this code is 5. The results shown in figure 9(b) demonstrate the significant advantage of the XZZX code: the Hadamard rotation breaks the symmetry of the hypergraph product code, allowing the XZZX code to exploit the noise bias to improve decoding performance. Additionally, the NBP + CNN + BIAS algorithm presented in this paper has a notable decoding advantage because [[129,28,3]] is a degenerate quantum code [12, 16].
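The hypergraph product construction of [[129,28,3]] can be sketched numerically. The sketch below assumes one standard convention for the product, HX = [H1⊗I | I⊗H2ᵀ] and HZ = [I⊗H2 | H1ᵀ⊗I], together with textbook check matrices for the two classical BCH codes; the exact matrices used in the paper may differ, but the parameters n = 7·15 + 3·8 = 129 and k = 4·7 = 28 follow from any valid choice.

```python
import numpy as np

def gf2_rank(mat):
    """Rank of a binary matrix over GF(2) by Gaussian elimination."""
    m = mat.copy() % 2
    rank, (rows, cols) = 0, m.shape
    for c in range(cols):
        pivot = next((r for r in range(rank, rows) if m[r, c]), None)
        if pivot is None:
            continue
        m[[rank, pivot]] = m[[pivot, rank]]
        for r in range(rows):
            if r != rank and m[r, c]:
                m[r] ^= m[rank]
        rank += 1
    return rank

# [7,4,3] Hamming (BCH) check matrix: columns are the binary digits of 1..7.
H1 = np.array([[(j >> b) & 1 for j in range(1, 8)] for b in range(3)],
              dtype=np.uint8)

# [15,7,5] BCH code: g(x) = x^8+x^7+x^6+x^4+1, h(x) = (x^15+1)/g(x)
# = x^7+x^6+x^4+1; check-matrix rows are shifts of the reversed h(x).
h_rev = np.array([1, 1, 0, 1, 0, 0, 0, 1], dtype=np.uint8)  # x^7 ... x^0
H2 = np.zeros((8, 15), dtype=np.uint8)
for i in range(8):
    H2[i, i:i + 8] = h_rev

def hypergraph_product(H1, H2):
    """Hypergraph-product CSS check matrices (one standard convention)."""
    m1, n1 = H1.shape
    m2, n2 = H2.shape
    HX = np.hstack([np.kron(H1, np.eye(n2, dtype=np.uint8)),
                    np.kron(np.eye(m1, dtype=np.uint8), H2.T)]) % 2
    HZ = np.hstack([np.kron(np.eye(n1, dtype=np.uint8), H2),
                    np.kron(H1.T, np.eye(m2, dtype=np.uint8))]) % 2
    return HX, HZ

HX, HZ = hypergraph_product(H1, H2)
n = HX.shape[1]                          # 129 physical qubits
k = n - gf2_rank(HX) - gf2_rank(HZ)      # 28 logical qubits
# CSS commutation condition: HX and HZ must be orthogonal over GF(2).
assert not ((HX.astype(int) @ HZ.T.astype(int)) % 2).any()
```

The XZZX form is then obtained by Hadamard-rotating one block of qubits, which swaps the X and Z components of the checks on that block and breaks the symmetry exploited above.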
Figure 10 shows the decoding curves of bias-tailored codes [[12,2,3]] and [[24,2]] (provided in [9]) with physical error rates ranging from 0.06 to 0.1. We observe that as the X-bias BX increases, the error rate decreases exponentially. Additionally, our NBP + CNN + BIAS decoder outperforms BP + OSD, with even the worst-performing curve of our decoder being better than the best-performing BP + OSD curve. This demonstrates that our decoder has a stronger noise resistance capability.
Figure 10. Simulation curves of bias-tailored quantum codes [[12,2,3]] and [[24,2]] under different physical error rates. The OSD orders of [[12,2,3]] and [[24,2]] are 7 and 13, respectively.

6. Conclusions and discussions

Physical qubits are often affected by biased noise, in which one type of error occurs more frequently than the others. In this paper, we have proposed a neural network-based decoding algorithm, NBP + CNN + BIAS, for bias-tailored quantum codes. Extensive simulations and analyses have revealed the following advantages of NBP + CNN + BIAS. First, we have exploited the asymmetry of the noise to enhance the decoding performance of bias-tailored QECCs. Second, we have shown that the NBP + CNN + BIAS decoder achieves a higher decoding threshold on Surface codes. Finally, we have shown that the considered convolutional schemes improve the convergence rate during the training phase of the decoder.
Future research could focus on decoding non-binary quantum codes or longer codes over asymmetric channels, emphasizing efficient algorithms that leverage noise asymmetry and ensure scalability. Investigating the impact of noise bias and exploring advanced optimization strategies, such as neural networks or hybrid methods, may further enhance decoding performance and merits further research.

This work was supported by the National Natural Science Foundation of China (Grant Nos. 62371240, 61802175, 62401266, and 12201300), the National Key R&D Program of China (Grant No. 2022YFB3103800), the Natural Science Foundation of Jiangsu Province (Grant No. BK20241452), the Fundamental Research Funds for the Central Universities (Grant No. 30923011014), and the fund of Laboratory for Advanced Computing and Intelligence Engineering (Grant No. 2023-LYJJ-01-009).

References

[1] Akhter Hossain K 2023 The potential and challenges of quantum technology in modern era Sci. Res. J. 11 41–49
[2] Möller M, Vuik C 2017 On the impact of quantum computing technology on future developments in high-performance scientific computing Ethics Inf. Technol. 19 253–269
[3] Sonko S, Ibekwe K I, Ilojianya V I, Etukudoh E A, Fabuyide A 2024 Quantum cryptography and US digital security: a comprehensive review: investigating the potential of quantum technologies in creating unbreakable encryption and their future in national security Comput. Sci. IT Res. J. 5 390–414
[4] Roffe J 2019 Quantum error correction: an introductory guide Contemp. Phys. 60 226–245
[5] Yang Z, Zolanvari M, Jain R 2023 A survey of important issues in quantum computing and communications IEEE Commun. Surv. Tutorials 25 1059–1094
[6] Aliferis P, Brito F, DiVincenzo D P, Preskill J, Steffen M, Terhal B M 2009 Fault-tolerant computing with biased-noise superconducting qubits: a case study New J. Phys. 11 013061
[7] Lescanne R, Villiers M, Peronnin T, Sarlette A, Delbecq M, Huard B, Kontos T, Mirrahimi M, Leghtas Z 2020 Exponential suppression of bit-flips in a qubit encoded in an oscillator Nat. Phys. 16 509–513
[8] Bonilla Ataides J P, Tuckett D K, Bartlett S D, Flammia S T, Brown B J 2021 The XZZX surface code Nat. Commun. 12 2172
[9] Roffe J, Cohen L Z, Quintavalle A O, Chandra D, Campbell E T 2023 Bias-tailored quantum LDPC codes Quantum 7 1005
[10] Kuo K-Y, Lai C-Y 2020 Refined belief propagation decoding of sparse-graph quantum codes IEEE J. Sel. Areas Inf. Theory 1 487–498
[11] Renes J M 2017 Belief propagation decoding of quantum channels by passing quantum messages New J. Phys. 19 072001
[12] Kuo K-Y, Lai C-Y 2022 Exploiting degeneracy in belief propagation decoding of quantum codes npj Quantum Inf. 8 111
[13] Chytas D, Pacenti M, Raveendran N, Flanagan M F, Vasić B 2024 Enhanced message-passing decoding of degenerate quantum codes utilizing trapping set dynamics IEEE Commun. Lett.
[14] Huang T-H, Ueng Y-L 2024 A binary BP decoding using posterior adjustment for quantum LDPC codes ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 9001–9005
[15] Roffe J, White D R, Burton S, Campbell E 2020 Decoding across the quantum low-density parity-check code landscape Phys. Rev. Res. 2 043423
[16] Panteleev P, Kalachev G 2021 Degenerate quantum LDPC codes with good finite length performance Quantum 5 585
[17] Higgott O 2022 PyMatching: a Python package for decoding quantum codes with minimum-weight perfect matching ACM Trans. Quantum Comput. 3 1–16
[18] Higgott O, Gidney C 2025 Sparse blossom: correcting a million errors per core second with minimum-weight matching Quantum 9 1600
[19] Wang D S, Fowler A G, Stephens A M, Hollenberg L C L 2009 Threshold error rates for the toric and surface codes arXiv:0905.0531
[20] Criger B, Ashraf I 2018 Multi-path summation for decoding 2D topological codes Quantum 2 102
[21] Liu Y-H, Poulin D 2019 Neural belief-propagation decoders for quantum error-correcting codes Phys. Rev. Lett. 122 200501
[22] Miao S, Schnerring A, Li H, Schmalen L 2023 Neural belief propagation decoding of quantum LDPC codes using overcomplete check matrices 2023 IEEE Information Theory Workshop (ITW) 215–220
[23] Ji N, Chen Z, Qu Y, Bao R, Yang X, Wang S 2023 Fault-tolerant quaternary belief propagation decoding based on a neural network Front. Phys. 11 1164567
[24] Sarker I H 2021 Machine learning: algorithms, real-world applications and research directions SN Comput. Sci. 2 160
[25] Sharifani K, Amini M 2023 Machine learning and deep learning: a review of methods and applications World Inf. Technol. Eng. J. 10 3897–3904
[26] Mendonça M O K, Netto S L, Diniz P S R, Theodoridis S 2024 Signal Processing and Machine Learning Theory (Amsterdam: Elsevier) ch 13, 869–959
[27] Krenn M, Landgraf J, Foesel T, Marquardt F 2023 Artificial intelligence and machine learning for quantum technologies Phys. Rev. A 107 010101
[28] Marquardt F 2021 Machine learning and quantum devices SciPost Phys. Lect. Notes 29 1–44
[29] Gong A, Cammerer S, Renes J M 2024 Graph neural networks for enhanced decoding of quantum LDPC codes 2024 IEEE International Symposium on Information Theory (ISIT) 2700–2705
[30] Buchberger A, Häger C, Pfister H D, Schmalen L, Graell i Amat A 2021 Learned decimation for neural belief propagation decoders ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 8273–8277
[31] Varsamopoulos S, Bertels K, Almudever C G 2019 Comparing neural network based decoders for the surface code IEEE Trans. Comput. 69 300–311
[32] Andreasson P, Johansson J, Liljestrand S, Granath M 2019 Quantum error correction for the toric code using deep reinforcement learning Quantum 3 183
[33] Choukroun Y, Wolf L 2024 Deep quantum error correction Proceedings of the AAAI Conference on Artificial Intelligence 38 64–72
[34] Calderbank A R, Shor P W 1996 Good quantum error-correcting codes exist Phys. Rev. A 54 1098
[35] Grospellier A, Krishna A 2018 Numerical study of hypergraph product codes arXiv:1810.03681
[36] Panteleev P, Kalachev G 2022 Quantum LDPC codes with almost linear minimum distance IEEE Trans. Inf. Theory 68 213–229
[37] Kuo K-Y, Chern I-C, Lai C-Y 2021 Decoding of quantum data-syndrome codes via belief propagation 2021 IEEE International Symposium on Information Theory (ISIT) 1552–1557
[38] Raveendran N, Rengaswamy N, Pradhan A K, Vasić B 2022 Soft syndrome decoding of quantum LDPC codes for joint correction of data and syndrome errors 2022 IEEE International Conference on Quantum Computing and Engineering (QCE) 275–281
[39] Singh Maan A, Paler A 2024 Machine learning message-passing for the scalable decoding of QLDPC codes npj Quantum Inf. 11 1–8
