Communications in Theoretical Physics,
Gravitation Theory, Astrophysics and Cosmology

CMB delensing with deep learning

  • Shulei Ni 1, 2 ,
  • Yichao Li 1 ,
  • Xin Zhang 1, 3, 4, *
  • 1Liaoning Key Laboratory of Cosmology and Astrophysics, College of Sciences, Northeastern University, Shenyang 110819, China
  • 2Research Center for Astronomical Computing, Zhejiang Laboratory, Hangzhou 311121, China
  • 3MOE Key Laboratory of Data Analytics and Optimization for Smart Industry, Northeastern University, Shenyang 110819, China
  • 4National Frontiers Science Center for Industrial Intelligence and Systems Optimization, Northeastern University, Shenyang 110819, China

Author to whom any correspondence should be addressed.

Received date: 2025-07-16

  Revised date: 2025-10-09

  Accepted date: 2025-10-09

  Online published: 2025-12-16

Copyright

© 2025 Institute of Theoretical Physics CAS, Chinese Physical Society and IOP Publishing. All rights, including for text and data mining, AI training, and similar technologies, are reserved.
This article is available under the terms of the IOP-Standard License.

Cite this article

Shulei Ni, Yichao Li, Xin Zhang. CMB delensing with deep learning[J]. Communications in Theoretical Physics, 2026, 78(3): 035405. DOI: 10.1088/1572-9494/ae1940

1. Introduction

The anisotropies of the cosmic microwave background (CMB) offer crucial insights into the early Universe. Observations of such anisotropies have made significant contributions to establishing the current standard model of cosmology. In the last few decades, three generations of satellite experiments, namely COBE [1-3], WMAP [4-6], and Planck [7-9], as well as numerous ground-based experiments (e.g., DASI [10], CBI [11], SPTpol [12], BICEP [13, 14], Keck Array [15], ACT [16], and AliCPT [17]) and balloon experiments (e.g., BOOMERanG [18], EBEX [19], and SPIDER [20]), have been carried out to precisely measure the temperature and polarization power spectra of the CMB. Moreover, there are several ongoing plans for diverse experimental projects, including LiteBIRD [21], CMB-S4 [22, 23], AliCPT [17], and PIPER [24], devoted to the detection of CMB polarization signals.
The generation of a stochastic background of gravitational waves, known as primordial gravitational waves (PGWs), is a fundamental prediction of any cosmological inflation model [25, 26]. This signal encodes unique information about the physics of the early Universe and its subsequent evolution, providing an exciting and powerful window into the Universe's origin and evolution.
The tensor-to-scalar ratio r parameterizes the amplitude of PGWs and connects to the energy scale at which inflation occurred. Therefore, the detection of PGWs is expected to reveal the physical properties of the Universe's early stages. Fortunately, these PGWs have distinct imprints in the polarized anisotropy of the CMB, displaying a spiral-like B-mode pattern that enables the extraction of the PGW signal from the polarized B-mode of the CMB [22, 27]. The detection of B-mode signals presents an exceptionally challenging task, particularly due to the uncertain extent of foreground contamination and the mixing of relatively strong E-mode signals with B-mode signals induced by weak gravitational lensing [27]. Therefore, the primary scientific objectives of CMB polarization observations currently entail determining the extent of foreground contamination and precisely measuring the effect of gravitational lensing [28]. Our research primarily centers on the gravitational lensing phenomenon associated with CMB.
CMB lensing has been extensively studied in theory [29-31]. After decoupling from matter at the last scattering surface, CMB photons propagate freely and are gravitationally deflected by the Universe's large-scale distribution of matter. The lensing effect results in subtle imprints on the CMB temperature and polarization anisotropies. These can be used to reconstruct a map of the lensing potential, whose gradient gives the lensing deflections [32]. Weak gravitational lensing smooths the acoustic peaks of the CMB angular power spectra, transferring power from larger angular scales to smaller scales, and converting E-mode polarization to B-mode polarization [33, 34]. In addition, weak gravitational lensing induces small distortions in the CMB, and these distortions can be detected through the anisotropic modification of its primordial morphology.
Given that the anisotropy at last scattering can be approximated as Gaussian, the non-Gaussian structures observed in the lensed CMB sky provide additional information about the matter responsible for the lensing. This is crucial for a further understanding of important details regarding the distribution of matter [29, 35, 36]. Therefore, weak gravitational lensing of the CMB both aids and hinders our comprehension of the history and content of the Universe.
There are several advantages associated with mitigating the impacts of lensing on the observed CMB temperature and polarization maps [29, 34, 36, 37]. Firstly, the average temperature of the CMB remains unchanged by lensing as the gravitational effect only adjusts arrival directions and not the surface brightness [36, 37]. Secondly, the distortion of light paths traveling from far sources to reach us is caused by the gravitational effect of the Universe's inhomogeneities. The reason lensing is so promising is that it enables probing of all clustering stress-energy components in the Universe through space-time perturbations since light paths react to mass. Measurement of these distortions provides insight into the mass distribution of the Universe [29, 36, 37].
However, gravitational lensing smooths the acoustic peaks in the CMB. This can be understood by noting that when photons are deflected, features of a fixed angular size can be either magnified or de-magnified, blurring sharp features in the power spectrum across a range of scales. The angular scale of sharp features in the power spectrum is easier to measure than that of broad humps; consequently, gravitational lensing weakens our ability to precisely measure acoustic peak positions in the CMB power spectra [29, 37]. Lensing modifies the temperature power spectrum at the 0.2% level at $\ell$ ~ 2000, while smaller scales are affected at the percent level. The impact on the B-mode polarization power spectrum is more significant, with a power increase of about 6% across all scales [29].
Delensing reverses this peak smoothing, providing sharper peaks with a more precisely measurable angular scale. Similar observations can be made about the measurement of peak heights. Weak gravitational lensing alters the CMB power spectra, induces non-Gaussian features, and generates a B-mode polarization signal, which causes confusion for the signal from PGWs [29, 37].
In recent years, deep learning and artificial intelligence techniques have developed rapidly, gaining widespread attention and significant application across disciplines. Astronomical research has followed suit, with many studies applying deep learning to data analysis [38-48]. These studies have demonstrated the high effectiveness of deep learning in image reconstruction and segmentation tasks, enabling the detection of features at the pixel level [49-51].
This work adopts the sky map segmentation method proposed by Makinen [44]. The method relies on the HEALPix pixelization scheme, which divides the two-dimensional spherical sky map into multiple two-dimensional plane images. We apply this sky map segmentation method to attempt a novel CMB delensing approach that can remove the lensing effect.
The paper is organized as follows. In section 2, we present the simulation of the CMB and the preprocessing of simulated data. In section 3, we describe the methods of quadratic estimator (QE) delensing and UNet++ delensing. In section 4, we provide a comprehensive analysis of the obtained results. Finally, in section 5 we provide the concluding remarks.

2. Data simulation and preprocessing

In this section, we describe the simulated CMB and noisy sky maps used for our analysis. This includes detailed information on lensed temperature and polarization maps with noise and instrumental effects, as well as unlensed temperature and polarization maps.
For our simulations, we employed the publicly available package Lenspyx [37, 52], a Python package specifically designed for simulating lensed CMB maps on a curved sky. We also relied on three additional software packages, namely CAMB [35], LensPix [37], and Healpy [53, 54], as the foundational components for developing detailed curved-full-sky simulations of both lensed and unlensed CMB. By incorporating the Planck 2018 CMB lensing pipeline (plancklens), Lenspyx has the ability to replicate both the published map and band-powers.
We have employed a concise formula to generally describe our data simulation process, which is presented as follows:
$\begin{eqnarray}{{ \mathcal M }}_{{\rm{lensed}}}^{{ \mathcal F }}={\rm{Smooth}}[{\rm{Lenspyx}}({{ \mathcal M }}_{{\rm{unlensed}}}^{{ \mathcal F }})+{{ \mathcal M }}_{{\rm{N}}}^{{ \mathcal F }}],\end{eqnarray}$
where ${ \mathcal M }$ represents the full-sky map and ${ \mathcal F }\in \{{\rm{T}},{\rm{Q}},{\rm{U}}\}$. Smooth represents the telescope's beam, for which we have chosen a simple Gaussian beam. Lenspyx is an abbreviation for the process of converting unlensed maps into lensed maps. N represents the noise component in the image. We also considered the impact of instrumental effects and used a Gaussian beam with a full width at half maximum (FWHM) of 7.9 arcmin to simplify the processing.
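The Smooth step in equation (1) corresponds, in harmonic space, to multiplying each ${a}_{\ell m}$ by a Gaussian beam window ${b}_{\ell }=\exp [-\ell (\ell +1){\sigma }^{2}/2]$ with $\sigma ={\theta }_{{\rm{FWHM}}}/\sqrt{8\mathrm{ln}2}$. A minimal sketch of this window function (pure numpy; the function name and sample values are illustrative, not from the paper's code):

```python
import numpy as np

def gaussian_beam(lmax, fwhm_arcmin):
    """Gaussian beam window function b_ell for a given FWHM (in arcmin)."""
    fwhm_rad = np.radians(fwhm_arcmin / 60.0)            # arcmin -> radians
    sigma = fwhm_rad / np.sqrt(8.0 * np.log(2.0))        # FWHM -> Gaussian sigma
    ell = np.arange(lmax + 1)
    return np.exp(-0.5 * ell * (ell + 1) * sigma**2)

bl = gaussian_beam(2000, 7.9)
# b_0 = 1 (the monopole is not suppressed); power is damped at high ell
```

Multiplying the ${a}_{\ell m}$ by this window and transforming back to a map reproduces the Gaussian smoothing assumed in equation (1).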

2.1. CMB temperature and lensing potential

The CMB radiation field is represented by the temperature anisotropy, denoted as $T(\hat{n})$, and the polarization, denoted as $P(\hat{n})$, in the direction $\hat{n}$ on the celestial sphere. We observe CMB temperature fluctuations projected on the 2D celestial sphere, and it is customary in the literature to expand the temperature field in spherical harmonics.
The temperature fluctuation of the CMB on the spherical surface is described by a scalar field, consisting of small fluctuations ${\rm{\Delta }}T(\hat{n})$ at a level of 10⁻⁵ relative to the average value T0 = 2.725 K. For full-sky observations, the temperature field T can be expressed through the spherical harmonic decomposition using spin-0 spherical harmonics ${Y}_{\ell m}(\hat{n})$,
$\begin{eqnarray}{\rm{\Delta }}T(\hat{n})=\displaystyle \sum _{\ell =0}^{{\ell }_{\max }}\displaystyle \sum _{m=-\ell }^{\ell }{a}_{\ell m}{Y}_{\ell m}(\hat{n}).\end{eqnarray}$
All the information contained in the temperature field $T(\hat{n})$ is encoded in the amplitudes ${a}_{\ell m}$, the spherical harmonic coefficients, which can be expressed using the following formula,
$\begin{eqnarray}{a}_{{\ell }m}=\int {\rm{d}}{{\rm{\Omega }}}_{n}{\rm{\Delta }}T(\hat{n}){Y}_{{\ell }m}^{* }(\hat{n}),\end{eqnarray}$
and analogously to the methodology in Fourier space, we can define an angular power spectrum for these fluctuations, denoted as ${C}_{\ell }$, by calculating the variance of the harmonic coefficients,
$\begin{eqnarray}\langle {a}_{{\ell }m}{a}_{{\ell }^{\prime} {m}^{{\prime} }}^{* }\rangle ={\delta }_{{\ell }{{\ell }}^{{\prime} }}{\delta }_{m{m}^{{\prime} }}{C}_{{\ell }},\end{eqnarray}$
where the above average is taken over many ensembles and the delta functions arise from isotropy. We can write the following expression for the angular power spectrum,
$\begin{eqnarray}{C}_{{\ell }}^{TT}=\frac{1}{2{\ell }+1}\displaystyle \sum _{m}\langle {a}_{{\ell }m}^{T}{{a}_{{\ell }m}^{T}}^{* }\rangle .\end{eqnarray}$
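As a toy check of equation (5), one can draw $2\ell +1$ complex Gaussian coefficients with a known input power and verify that averaging $|{a}_{\ell m}{|}^{2}$ over m recovers it. The sketch below, for simplicity, treats all m as independent, ignoring the reality condition ${a}_{\ell ,-m}={(-1)}^{m}{a}_{\ell m}^{* }$ (all names and numbers are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
ell, C_true = 500, 2.0          # one multipole with input power C_ell = 2

# draw 2*ell + 1 complex Gaussian a_lm with total variance C_true
n_m = 2 * ell + 1
alm = rng.normal(scale=np.sqrt(C_true / 2), size=n_m) \
    + 1j * rng.normal(scale=np.sqrt(C_true / 2), size=n_m)

# equation (5): average |a_lm|^2 over the 2*ell + 1 values of m
C_hat = np.sum(np.abs(alm) ** 2) / n_m
# C_hat scatters around C_true with cosmic variance ~ sqrt(2/(2*ell+1)) * C_true
```

The residual scatter of C_hat around C_true is the cosmic-variance floor that no estimator can beat for a single realization of the sky.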
We can simulate the angular power spectrum ${C}_{\ell }$ with CAMB or CLASS [55, 56], as shown in figure 1. The figure demonstrates that as the scale decreases, the difference between lensed and unlensed data increases. This phenomenon occurs due to the deflection imparted on CMB photons by each encountered potential, resulting in an accumulation of effects on the CMB power spectrum at small scales. Consequently, the lensing effect distorts the original power spectrum and introduces non-Gaussianity in the lensed CMB [29]. It should be noted that non-linear evolution also contributes to an increase in power on smaller scales.
Figure 1. CMB XX angular power spectrum, X ∈ {T, E, B}. Top panel: the red solid line represents the lensed TT spectrum, while the light red dashed line corresponds to the unlensed TT spectrum. The green solid line denotes the EE spectrum under the influence of lensing, whereas the light green dashed line shows the EE spectrum without lensing effects. The blue solid line indicates the lensing-modified BB spectrum, and the light blue dashed line depicts the BB spectrum in the absence of lensing effects. The parameters for generating the temperature angular power spectrum are As = 2.1 × 10⁻⁹ and r = 0.005. The black solid line displays the noise level of the temperature detectors, and the grey solid line represents that of the polarization detectors. Bottom panel: this illustrates the relative differences between these highly similar spectra. The red dashed line marks the ratio of the unlensed TT spectrum to the lensed TT spectrum; the green dashed line indicates the ratio of the unlensed EE spectrum to the lensed EE spectrum; and the blue dashed line shows the ratio of the unlensed BB spectrum to the lensed BB spectrum.
Weak lensing of the CMB deflects photons coming from the original direction ${\hat{n}}^{{\prime} }$ on the last scattering surface to the direction $\hat{n}$ on the observed sky, so the lensed CMB temperature field is given by $\widetilde{X}(\hat{n})=X({\hat{n}}^{{\prime} })$ in terms of the unlensed field X = T [37]. Thus the position in the sky where we finally see the CMB photons is determined by the integral of the gravitational potential along the line of sight to the last scattering surface.
We introduce an integrated lensing potential, denoted as ψ. The deflection vector is expressed as the gradient of the lensing potential, ${\rm{\nabla }}\psi (\hat{n})$, where ${\rm{\nabla }}$ represents the covariant derivative on the sphere. The vector ${\hat{n}}^{{\prime} }$ is obtained from $\hat{n}$ by moving a distance $|{\rm{\nabla }}\psi (\hat{n})|$ along the geodesic on the unit sphere in the direction of ${\rm{\nabla }}\psi (\hat{n})$. Through weak gravitational lensing, the CMB photon observed in direction $\hat{n}$ thus originates from the unlensed direction ${\hat{n}}^{{\prime} }=\hat{n}+{\rm{\nabla }}\psi (\hat{n})$, and the lensed CMB temperature can be written as follows,
$\begin{eqnarray}\widetilde{T}(\hat{n})=T({\hat{n}}^{{\prime} })=T(\hat{n}+{\rm{\nabla }}\psi (\hat{n})),\end{eqnarray}$
where lensing potential $\psi (\hat{n})$ can be defined as [29, 57],
$\begin{eqnarray}\psi (\hat{n})=2{\int }_{0}^{{\chi }_{\star }}{\rm{d}}\chi \left(\frac{{\chi }_{\star }-\chi }{{\chi }_{\star }\chi }\right){\rm{\Psi }}(\chi \hat{n};{\eta }_{0}-\chi ),\end{eqnarray}$
where χ is the comoving distance, ${\chi }_{\star }$ is the comoving distance to the last scattering surface, $\Psi$ is the Bardeen potential, and ${\eta }_{0}-\chi $ is the conformal time at which the photon was at position $\chi \hat{n}$ along the line of sight. As with the CMB temperature angular power spectrum, the angular power spectrum of the lensing potential can be obtained,
$\begin{eqnarray}{C}_{\ell }^{\psi \psi }=16\pi \int \frac{{\rm{d}}k}{k}{P}_{{\rm{R}}}(k){\left({\int }_{0}^{{\chi }_{\star }}{\rm{d}}\chi \left(\frac{{\chi }_{\star }-\chi }{{\chi }_{\star }\chi }\right)T(k;{\eta }_{0}-\chi ){j}_{\ell }(k\chi )\right)}^{2},\end{eqnarray}$
where $T(k;{\eta }_{0}-\chi )$ is the appropriate transfer function, ${j}_{\ell }$ is the spherical Bessel function, and PR(k) is the primordial power spectrum.
We utilized Lenspyx to simulate lensed and unlensed CMB temperature and polarization sky maps. In our simulations, we adopted the best-fit cosmological parameters from the Planck 2018 $\Lambda$CDM results [9], i.e., H0 = 67.7 km s⁻¹ Mpc⁻¹, Ωb = 0.049, Ωm = 0.311, Ω$\Lambda$ = 0.689, and σ8 = 0.81. For the temperature sky maps, we varied the scalar amplitude As and the spectral index ns, to which the temperature spectrum is most sensitive, while for the polarization we varied only the tensor-to-scalar ratio r. The temperature and polarization simulations were conducted independently.
To acquire additional datasets and prepare for subsequent polarization analysis, we conducted experiments using two key cosmological parameters: As and r, both of which play crucial roles in determining the temperature and polarization power spectra of the CMB. Specifically, we sampled values of As from 2.0 × 10⁻⁹ to 2.2 × 10⁻⁹ and ten values of r linearly spaced between 0.001 and 0.01, resulting in a total of 30 CMB TT angular power spectra and corresponding sky maps.
In the top panel of figure 1, the red solid line and the light red dashed line illustrate the lensed and unlensed TT angular power spectra, respectively, for As = 2.1 × 10⁻⁹ and r = 0.005. The ratio between them is shown by a red dashed line in the bottom panel. On large scales, the anisotropies of the CMB are primarily dominated by emissions from the last scattering surface at a redshift of z ~ 1100. However, on smaller scales, the CMB is more significantly influenced by what are known as secondary effects. These secondary anisotropies arise as a result of interactions between CMB photons and matter along the line of sight.
Our work is grounded in the analysis of full-sky maps, thus requiring the use of the Lenspyx tool to transform the generated 30 angular power spectra into corresponding temperature sky maps. To investigate the impact of the lensing potential on small-scale structures, we conducted simulation experiments at a high resolution of Nside = 2048. To ensure that the simulation results more closely match actual observational data, we accounted for the smoothing effect of the telescope beam during computation, using an FWHM of θFWHM = 8.3 arcmin.

2.2. CMB polarization

CMB polarization is measured through time-averaged Stokes parameters, which quantify the linear polarization of the electric field along Cartesian axes orthogonal to the line of sight (LOS). Because Thomson scattering cannot generate circular polarization, Q and U suffice to describe CMB polarization [34]. Astronomical observations reveal that the Stokes parameters Q and U are related by a relative 45° rotation around the LOS, while the reference frame itself may be rotated freely about it. Tying the reference frame to a fixed set of basis vectors under rotation directly links Q and U to the E- and B-modes.
In this subsection, we provide a concise overview of the polarization field properties and its decomposition into physically distinct E- and B-modes. We express the standard construction of E and B fields in terms of the spin-raising and spin-lowering operators, usually implemented in harmonic space.
If we rotate Q and U by an angle α on the plane that is perpendicular to the direction of $\hat{n}$, we obtain the following solution [58-60],
$\begin{eqnarray}{(Q\pm {\rm{i}}U)}^{{\prime} }(\hat{n})={{\rm{e}}}^{\mp 2{\rm{i}}\alpha }(Q\pm {\rm{i}}U)(\hat{n}).\end{eqnarray}$
We can derive the separation of E- and B-modes from the Stokes parameters [61-65]. Here, we provide a brief overview of the standard method. Additionally, we can decompose Q and U into ±2 spin spherical harmonics concerning rotation as shown below [58, 59],
$\begin{eqnarray}Q(\hat{n})\pm {\rm{i}}U(\hat{n})=\displaystyle \sum _{\ell ,m}{a}_{\pm 2,\ell m}\,{}_{\pm 2}{Y}_{\ell m}(\hat{n}),\end{eqnarray}$
where ${}_{\pm 2}{Y}_{\ell m}(\hat{n})$ are the spin-$\pm 2$ spherical harmonics, and the coefficients ${a}_{\pm 2,\ell m}$ are given by
$\begin{eqnarray}{a}_{\pm 2,\ell m}=\int (Q(\hat{n})\pm {\rm{i}}U(\hat{n}))\,{}_{\pm 2}{Y}_{\ell m}^{* }(\hat{n})\,{\rm{d}}\hat{n}.\end{eqnarray}$
The E and B modes in the spherical harmonic space are formed by
$\begin{eqnarray}\begin{array}{rcl}{a}_{{\ell }m}^{E} & = & -({a}_{2,{\ell }m}+{a}_{-2,{\ell }m})/2,\\ {a}_{{\ell }m}^{B} & = & -({a}_{2,{\ell }m}-{a}_{-2,{\ell }m})/2{\rm{i}}.\end{array}\end{eqnarray}$
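Equation (12) is a simple linear combination of the spin-$\pm 2$ coefficients, and it is exactly invertible. A small numpy sketch with toy coefficients (all values illustrative; real coefficients would come from a spherical harmonic transform of Q and U):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 10  # a few (ell, m) coefficients, flattened into one array

# toy spin-(+2) and spin-(-2) harmonic coefficients
a_p2 = rng.normal(size=n) + 1j * rng.normal(size=n)
a_m2 = rng.normal(size=n) + 1j * rng.normal(size=n)

# equation (12): E/B modes from the spin-(+/-)2 coefficients
a_E = -(a_p2 + a_m2) / 2
a_B = -(a_p2 - a_m2) / (2j)

# the combination is invertible: recover a_{+/-2} from (E, B)
a_p2_back = -(a_E + 1j * a_B)
a_m2_back = -(a_E - 1j * a_B)
```

Invertibility is what guarantees that working in T/E/B space, as done later in this paper, loses no information relative to T/Q/U.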
By applying the angular correlation function, the sum can be reduced to an expression that only involves $\ell$ and a power spectrum term
$\begin{eqnarray}\begin{array}{rcl}{C}_{{\ell }}^{TE} & = & \frac{1}{2{\ell }+1}\displaystyle \sum _{m}\langle {a}_{{\ell }m}^{T}{{a}_{{\ell }m}^{E}}^{* }\rangle ,\\ {C}_{{\ell }}^{EE} & = & \frac{1}{2{\ell }+1}\displaystyle \sum _{m}\langle {a}_{{\ell }m}^{E}{{a}_{{\ell }m}^{E}}^{* }\rangle ,\\ {C}_{{\ell }}^{BB} & = & \frac{1}{2{\ell }+1}\displaystyle \sum _{m}\langle {a}_{{\ell }m}^{B}{{a}_{{\ell }m}^{B}}^{* }\rangle .\end{array}\end{eqnarray}$
Based on the aforementioned theoretical derivations, we employed the same methodology used for simulating temperature sky maps to simulate polarized sky maps. We conducted tests on 30 datasets using two critical parameters: the dark-energy equation-of-state parameter w ∈ {−1.025, −1, −0.975} and the tensor-to-scalar ratio r ∈ {0.001, 0.002, 0.003, 0.004, 0.005, 0.006, 0.007, 0.008, 0.009, 0.01}. As with the temperature sky maps, each distinct parameter set was employed to generate a corresponding full-sky map.
In the top panel of figure 1, we use a green solid line and a light green dashed line to show the lensed and unlensed angular power spectra of the E-mode, respectively, and a blue solid line and a light blue dashed line to show the lensed and unlensed angular power spectra of the B-mode, with the dark-energy equation-of-state parameter w = -1 and the tensor-to-scalar ratio r = 0.005. Meanwhile, in the bottom panel, a green dashed line shows the ratio of the unlensed to the lensed E-mode power spectra, and a blue dashed line shows the ratio of the unlensed to the lensed B-mode power spectra. Similar to the simulation of the temperature sky maps, we utilized the Lenspyx tool to generate 30 sky maps corresponding to the polarization power spectra.
After obtaining the full-sky maps of the Stokes parameters T, Q, and U, we transformed them into the T, E, and B components in harmonic space using equation (12). This conversion enables a more direct analysis of the physical polarization modes and separates the scalar (E-mode) and tensor (B-mode) contributions. The resulting T/E/B maps serve as the input for our neural network, which is specifically designed to operate on the T/E/B fields for subsequent data processing and training.

2.3. Noise simulation

In the course of the actual observation of the CMB, it is crucial to consider various sources of noise interference. Given the faintness of the CMB signal, noise has the potential to mask or distort its true characteristics, leading to deviations in the measurement of core observables such as temperature and polarization. Moreover, the presence of noise can introduce systematic errors that impact our ability to infer cosmological parameters.
The noise in CMB detectors is often approximated as Gaussian white noise, and its angular power spectrum can be defined as follows [66-69]:
$\begin{eqnarray}{N}_{\ell }^{X{X}^{{\prime} }}={s}^{2}\exp \left(\ell (\ell +1)\frac{{\theta }_{{\rm{FWHM}}}^{2}}{8\mathrm{ln}2}\right),\end{eqnarray}$
where s denotes the telescope's sensitivity, X ∈ {T, E, B}, and θFWHM represents the FWHM of the telescope beam. For simplicity, we adopt ${\theta }_{{\rm{FWHM}}}=7.9\,{\rm{arcmin}}$, with a sensitivity of 1.5 μK arcmin for the temperature detector and 2.1 μK arcmin for the polarization detector [70].
According to equation (14), we can calculate the temperature and polarization angular power spectra of noise, as illustrated by the black and grey lines in the top panel of figure 1. For the noise of each component, we generate corresponding full-sky maps based on their angular power spectra and overlay these noise maps onto the sky maps with lensing effects. Similar to the simulation of the temperature sky maps, we utilized the Lenspyx tool to generate 30 sky maps corresponding to the noise power spectra.
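For concreteness, this noise model can be evaluated directly with the quoted sensitivities. The sketch below uses the standard Knox form ${N}_{\ell }={s}^{2}\exp [\ell (\ell +1){\theta }_{{\rm{FWHM}}}^{2}/(8\mathrm{ln}2)]$, which we take as the intended reading of equation (14); the function name is illustrative:

```python
import numpy as np

def knox_noise(ell, s_uK_arcmin, fwhm_arcmin):
    """Beam-deconvolved white-noise power spectrum (standard Knox formula)."""
    s = np.radians(s_uK_arcmin / 60.0)        # muK-arcmin -> muK-radian
    theta = np.radians(fwhm_arcmin / 60.0)    # beam FWHM in radians
    return s**2 * np.exp(ell * (ell + 1) * theta**2 / (8 * np.log(2)))

ell = np.arange(2, 3001)
N_TT = knox_noise(ell, 1.5, 7.9)  # temperature channel, 1.5 muK-arcmin
N_PP = knox_noise(ell, 2.1, 7.9)  # polarization channels, 2.1 muK-arcmin
```

The exponential factor undoes the beam suppression, so the effective noise rises steeply once $\ell$ exceeds the beam scale, which is the behaviour of the black and grey curves in figure 1.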

2.4. Data preprocessing

By conducting simulations with Lenspyx, we generated 30 sets of full-sky maps for Stokes parameters I, Q, U, along with noise maps. Using equation (1), we were then able to obtain the sky maps corresponding to our training data, i.e. T/E/B fields. Spherical data poses challenges for deep learning due to its non-Euclidean structure. Traditional convolutions are not directly applicable on the sphere because of the lack of translational equivariance, and the rotational symmetry on the sphere is also difficult to handle. Moreover, spherical sampling is often irregular, such as in the HEALPix grid, which complicates feature extraction.
There are currently available network solutions for deep learning on the sphere, such as DeepSphere [71] and NNhealpix [72]. Both methods are highly innovative and have made outstanding contributions to the application of artificial intelligence algorithms on spherical data. We have attempted both methods, but the extremely high resolution of the full-sky map we are considering, i.e., Nside = 2048, makes both methods extremely time-consuming during the data initialization stage. However, regardless of the segmentation scheme used, if the full-sky map is not directly used as training data, boundary processing issues will inevitably arise. Therefore, we require more effective data processing and model training methods to address this challenge. Additionally, not all HEALPix pixels have exactly four neighboring pixels (some have only three), which may also introduce errors in certain segmentation approaches [72], as shown in figure 2.
Figure 2. A half-orthogonal view projection of the sky map. Gray numbers sequentially label pixels with four neighbors, while larger black numbers highlight those with only three.
When there is a mismatch between data and model, there are usually two ways to deal with it: modifying the model to fit the data, or adjusting the data format to meet the model requirements. In related research such as DeepSphere and NNhealpix, the first strategy is used, which is to modify the structure of the deep learning model to adapt to spherical data, so that the network can complete training. However, there are certain limitations in such methods, so we propose an alternative approach that segments spherical data to more flexibly address the mismatch between data and models. Our segmentation scheme builds upon Makinen's work [44]. Specifically, we approximate each pixel at a resolution of Nside = 4 as a flat region, as illustrated by the labeled sky patches in figure 2. Consequently, the entire sky is divided into 192 independent small sky region images.
For simulated sky maps with a resolution of Nside = 2048, after applying the aforementioned segmentation method, each full sky map is broken down into 192 image patches (number of pixels at resolution Nside = 4), each sized 512 × 512 pixels, as shown in figure 3. The sky patches in the first row are used to form the training dataset, while the corresponding labels are contained within the sky patches in the second row.
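The patch bookkeeping follows directly from the HEALPix hierarchy: a map at Nside = 2048 has 12 × 2048² pixels, and each of the 12 × 4² = 192 superpixels at Nside = 4 contains (2048/4)² = 512² of them in NESTED ordering. A minimal arithmetic check:

```python
NSIDE_MAP, NSIDE_PATCH = 2048, 4

n_patches = 12 * NSIDE_PATCH**2      # HEALPix pixel count at Nside = 4
side = NSIDE_MAP // NSIDE_PATCH      # pixels per patch edge
pix_per_patch = side**2

assert n_patches == 192 and side == 512
# the full-sky pixel budget is conserved: 192 * 512^2 == 12 * 2048^2
assert n_patches * pix_per_patch == 12 * NSIDE_MAP**2
```

In NESTED ordering the 512² pixels of each superpixel are contiguous in memory, which is what makes this segmentation into square-like image patches straightforward.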
Figure 3. The T, E, and B sky map patches. Each patch has a size of 214.86 deg². The center of the sky patches is (l, b) = (101.25°, 19.471°), where l and b are the Galactic longitude and Galactic latitude, respectively. From left to right, the sky patches are the T, E, and B maps, respectively. The top row of patches displays sky map patches affected by lensing effects, including noise and instrumental artifacts, which are used to construct the training dataset. The second row presents the original, unlensed sky map patches, which serve as the basis for generating the label dataset. The unit is μK.
Cosmological fields carry physical information at every observational scale, and this information is coupled across the sky. Consequently, partitioning a full-sky map into multiple independent regions inevitably entails some degree of information loss, and our segmentation approach is no exception. Therefore, we have proposed a mitigation strategy to address this issue.
We constructed a technical framework based on a pixel-level averaging fusion method, consisting of the following key steps. First, the original full-sky CMB map is randomly rotated by predefined angles. The rotated full-sky map is then divided into map patches. Subsequently, a convolutional neural network (CNN) is applied to perform delensing on each segmented image patch, generating corresponding intermediate delensed results. After the forward processing is completed, inverse rotation operations are performed on each intermediate result to restore the original spatial orientation.
By repeating this entire process 30 times, and given that the segmentation edges vary in each iteration due to the specific rotations, the final aggregated result effectively alleviates the errors introduced by any single segmentation, thereby significantly mitigating the errors caused by the segmentation process.
This operation is applied independently to each field of the sky maps, requiring 30 rotational iterations per sky map type with corresponding repetition of the training process. The rotation angles are configured with Φ values of [-180°, -120°, -60°, 0°, 60°, 120°] and θ values of [-60°, -30°, 0°, 30°, 60°].
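Combining the six Φ values with the five θ values on a regular grid yields exactly the 30 rotations used for training. A minimal sketch (how Φ and θ are paired is our assumption, since the text does not state the combination scheme explicitly):

```python
from itertools import product

phis = [-180, -120, -60, 0, 60, 120]   # longitude rotations (degrees)
thetas = [-60, -30, 0, 30, 60]         # latitude rotations (degrees)

# 6 x 5 = 30 (phi, theta) rotation pairs, one per training iteration
rotations = list(product(phis, thetas))
```

Each pair defines one rotation of the full-sky map before segmentation, with the corresponding inverse rotation applied after delensing.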
The rotation scheme aims to mitigate the error impact caused by the segmentation operation to a certain degree, but it cannot completely eliminate it. Additionally, the implementation of multiple image rotation processes necessitates the repetition of the corresponding training process, significantly increasing the demands on computational resources and time costs. Consequently, in our research, we limited the number of rotation operations to a total of 30 for the training dataset.
It is worth noting that since the E-mode and B-mode represent curl-free and divergence-free polarization fields, respectively, they can effectively separate signals of different physical origins. In particular, B-modes produced by PGWs and those induced by gravitational lensing effects can be more clearly distinguished and processed in T/E/B space. Therefore, using the T/E/B representation for CMB delensing is the optimal choice. After obtaining the I, Q, U Stokes parameters that have been affected by noise and instrumental effects, we can derive the T-, E-, and B-mode sky maps using equations (4) and (12).

3. QE delensing and UNet++ structure

In the field of cosmic signal analysis and intelligent image processing, the quadratic estimator (QE) algorithm relies on mathematical statistics to extract deep correlations in observed signals, aiding the inversion of cosmological parameters [30, 73, 74], while the UNet++ algorithm achieves high-precision image segmentation through deep convolutional networks and multiscale feature fusion [40, 42]. Although the two follow different paths (the former is based on physical statistical modeling, while the latter relies on data-driven learning), they jointly build a core methodology for extracting key information from complex data. We applied both methods to delens the CMB and then compared the discrepancies in their processing outcomes.

3.1. QE delensing

QE delensing is a powerful technique that reconstructs and removes the gravitational lensing signal imprinted on the CMB by large-scale structure, thereby sharpening acoustic features and reducing spurious B-mode power. At its core, this method uses optimal quadratic combinations of temperature (T) and polarization (E, B) multipoles to estimate the lensing potential, then 'undoes' the inferred deflections to recover a map closer to the primordial one [52].
In practice, the QE algorithm reconstructs the lensing potential φ by exploiting the off-diagonal covariance induced by gravitational deflection, which introduces correlations between originally independent Fourier modes of the unlensed CMB [75]. Schematically, one forms weighted pairs of observed multipoles and integrates over $\ell$ to yield [76]
$\begin{eqnarray}{\hat{\phi }}_{XY}(L)={N}_{L}^{XY}\int \displaystyle \frac{{{\rm{d}}}^{2}l}{{(2\pi )}^{2}}X(l)Y(L-l){f}_{XY}(l,L),\end{eqnarray}$
where X, Y ∈ {T, E, B}, and the filter $f_{XY}$ depends on the theoretical spectra ${C}_{\ell }^{XY}$. The deflection field ∇φ mixes the unlensed E- and B-polarizations.
The normalization ${N}_{L}^{XY}$ is chosen so that $\langle {\hat{\phi }}_{XY}(L)\rangle =\phi (L)$. Analogous filters $f_{XY}$ are derived for all channel pairs; an inverse-variance weighted sum of the individual estimators then yields the minimum-variance estimate $\hat{\phi }(L)$ [75].
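The minimum-variance combination of per-pair estimators can be sketched numerically: given the individual estimates $\hat{\phi}_{XY}(L)$ and their reconstruction noises ${N}_{L}^{XY}$, inverse-variance weighting yields the combined estimate and its noise. The toy arrays below are illustrative placeholders, not outputs of the actual pipeline.

```python
import numpy as np

def minimum_variance_phi(phi_estimates, noises):
    """Combine per-pair QE estimates phi_XY(L) into a minimum-variance
    estimate via inverse-variance weighting.

    phi_estimates : dict of {pair: array over L bins}
    noises        : dict of {pair: N_L^{XY} array over L bins}
    """
    inv_var = {p: 1.0 / noises[p] for p in phi_estimates}
    norm = sum(inv_var.values())                       # total inverse variance per L
    phi_mv = sum(inv_var[p] * phi_estimates[p] for p in phi_estimates) / norm
    noise_mv = 1.0 / norm                              # combined noise N_L^{MV}
    return phi_mv, noise_mv

# Toy example with two channel pairs and three L bins (illustrative values)
phi = {"TT": np.array([1.0, 2.0, 3.0]), "EB": np.array([1.2, 1.8, 3.2])}
N   = {"TT": np.array([0.5, 0.5, 0.5]), "EB": np.array([0.5, 0.5, 0.5])}
phi_mv, N_mv = minimum_variance_phi(phi, N)
```

With equal noises the weights are uniform, so the combined estimate reduces to a simple average while the combined noise is halved.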
We next perform delensing operations on full-sky temperature and polarization maps using the delensalot package, and conduct a systematic quantitative analysis to compare its delensing efficacy with that of the UNet++ architecture [74].

3.2. UNet++ structure

UNet++ is an enhanced version of the UNet architecture, modified to fuse multiscale features efficiently. The UNet network is a type of CNN that was initially designed for biomedical image segmentation [77]. It introduces considerable structural modifications to the standard CNN: a sequence of layers is appended to the usual contracting network, with up-sampling operations in place of pooling operations, so that the resolution of the output is increased. The expanding path is roughly symmetric to the contracting path, yielding a U-shaped structure [77]. This section introduces the deep neural network architecture UNet++ [40, 42] employed in our CMB delensing analysis; the derivative network we utilize is shown in figure 4 [77].
Figure 4. UNet++ network architecture. Each node in the graph represents a convolution block, downward arrows indicate down-sampling, upward arrows indicate up-sampling, dotted arrows indicate skip connections, and the dotted box indicates the four outputs. UNet++ combines UNets of different depths into a unified architecture. All substructures share the same encoder but have their own decoders. The original long skip connections are dropped, and every two neighboring nodes are connected with a short skip connection, enabling the deeper decoder to send supervisory signals to the shallower decoder. Finally, by connecting the decoders, densely connected skip pathways are generated so that dense features propagate along the skip connections, resulting in more flexible feature fusion at the decoder nodes. Thus, each node in the UNet++ decoder combines multiscale features of the same resolution from all its preceding nodes from a horizontal perspective, and integrates multiscale features of different resolutions from its preceding nodes from a vertical perspective. This multiscale feature aggregation in UNet++ gradually synthesizes the segmentation, resulting in improved accuracy and fast convergence.
Specifically, the left side of the UNet is the down-sampling (encoder) part, which is used to extract abstract features from the image. By using convolution and down-sampling operations, the image size is reduced to extract shallow features. The convolution operation uses a valid padding method, ensuring that the results are based on the context features without missing information. Therefore, after each convolution, the size of the image will be reduced.
In essence, the task of semantic segmentation entails distinguishing a particular class of images from other image classes through the utilization of segmentation masks. It can also be considered as image classification at a pixel level. Our work aims to classify the sky map of CMB temperature via convolution with the ultimate purpose of delensing CMB. As a result, our network also involves regression operations.
Figure 4 shows a unified UNet++ architecture that merges four UNets of varying depths. These UNets generate four outputs, designated outputs 1-4. In the graphical abstract, the original UNet appears in yellow, with skip connections depicted by dotted arrows and the four outputs displayed inside a dotted box. At the inference stage, UNet++ can be pruned by selecting a different output, provided it was trained with deep supervision.
By dropping some skip connections and connecting every two neighboring nodes with a short skip connection, the deeper decoder can send supervisory signals to the shallower decoder, leading to faster training. Furthermore, the decoders are connected, creating a densely connected skip connection, which allows dense features to propagate along the skip connection leading to flexible feature fusion. As a result, each node in the UNet++ decoder combines multiscale features of the same resolution from all preceding nodes horizontally while integrating multiscale features of different resolutions from preceding nodes vertically. This multiscale feature aggregation in UNet++ gradually synthesizes the segmentation, resulting in improved accuracy and fast convergence.
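The dense connectivity just described can be made concrete with a small helper that lists, for node X^{i,j} (depth i, column j), the inputs aggregated at that node: all preceding same-depth nodes X^{i,0..j-1} plus the up-sampled deeper node X^{i+1,j-1}. This is a structural sketch of the UNet++ wiring only, not our training code.

```python
def unetpp_inputs(i, j):
    """Inputs gathered at UNet++ node X^{i,j} (depth i, column j).

    Encoder backbone nodes (j == 0) take the down-sampled X^{i-1,0}.
    Decoder nodes (j > 0) concatenate dense skip connections from all
    same-resolution predecessors plus one up-sampled deeper node.
    """
    if j == 0:                               # encoder backbone
        return [] if i == 0 else [(i - 1, 0)]
    skips = [(i, k) for k in range(j)]       # horizontal dense skips
    upsampled = [(i + 1, j - 1)]             # vertical up-sampling path
    return skips + upsampled

# The last top-level decoder node fuses every earlier top-level feature map
print(unetpp_inputs(0, 3))   # [(0, 0), (0, 1), (0, 2), (1, 2)]
```

This makes explicit why each decoder node sees multiscale features of the same resolution horizontally and of different resolutions vertically.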

3.3. Loss function

Our goal is to obtain the unlensed CMB from the lensed CMB using a supervised regression algorithm that predicts continuous output values based on input values. We therefore analyzed various regression loss functions, including mean absolute error (MAE, L1 norm), mean squared error (MSE, L2 norm), Huber, and log-cosh. Log-cosh computes the logarithm of the hyperbolic cosine of the prediction error. Given the actual value ti and the predicted value pi, the log-cosh function is defined as
$\begin{eqnarray}L(p,t)=\displaystyle \sum _{i}\mathrm{log}\cosh ({p}_{i}-{t}_{i}).\end{eqnarray}$
The log-cosh function behaves like the MSE for small errors and like the MAE for large errors, and it is twice differentiable everywhere, whereas the Huber loss is not. MAE loss is the average of the absolute errors; since it measures only the average absolute distance between the expected and predicted data, it does not penalize large prediction errors strongly. MSE loss is the average of the squared errors; it emphasizes large errors, so outliers have a relatively large impact on the performance indicator. Consequently, we chose the log-cosh function for its superior resistance to outliers.
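A numerically stable log-cosh implementation (a sketch, using the identity log cosh x = |x| + log(1 + e^{-2|x|}) - log 2 to avoid overflow) makes the two regimes explicit: approximately x²/2 for small errors (MSE-like) and |x| - log 2 for large errors (MAE-like).

```python
import numpy as np

def log_cosh_loss(pred, target):
    """Log-cosh loss, summed over pixels, in an overflow-safe form."""
    x = np.asarray(pred) - np.asarray(target)
    # log(cosh(x)) = |x| + log1p(exp(-2|x|)) - log(2); stable for large |x|
    return np.sum(np.abs(x) + np.log1p(np.exp(-2.0 * np.abs(x))) - np.log(2.0))

# Small errors behave like MSE/2, large errors like MAE minus log 2
small = log_cosh_loss(np.array([0.01]), np.array([0.0]))   # ~ 0.01**2 / 2
large = log_cosh_loss(np.array([10.0]), np.array([0.0]))   # ~ 10 - log(2)
```

The naive form np.log(np.cosh(x)) would overflow for large prediction errors, which is why the rewritten identity is used.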

3.4. Training and testing

UNet networks are predicated on fully convolutional networks, which comprise a convolutional network and an inverse convolutional network. Thus, the heart of these networks is the convolutional layer, which involves convolving filters on the input data.
In alignment with the works of Makinen et al and Ni et al [44, 46], we have fixed the number of convolutional kernels at the outset of the input to 32. The kernel size determines the convolution's field of view, which we have established at 3 × 3. To achieve the requisite output dimensionality, we have employed 'same' padding to manage sample boundaries in both convolutions and transpose convolutions. The stride parameter specifies the kernel's traversal steps across the image. In our model, we have maintained the default settings of stride = 1 for convolutions and stride = 2 for transpose convolutions.
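The effect of 'same' padding can be checked with a minimal NumPy convolution: padding a map by one pixel before sliding a 3 × 3 kernel with stride 1 keeps the output the same size as the input. This is an illustrative sketch, not the TensorFlow layers used in our network.

```python
import numpy as np

def conv2d_same(image, kernel):
    """3x3 'same'-padded, stride-1 cross-correlation on a 2D map."""
    k = kernel.shape[0]                      # assume a square, odd-sized kernel
    pad = k // 2
    padded = np.pad(image, pad, mode="constant")
    out = np.zeros_like(image, dtype=float)
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            out[i, j] = np.sum(padded[i:i + k, j:j + k] * kernel)
    return out

patch = np.random.randn(8, 8)                # stand-in for a sky map patch
kernel = np.ones((3, 3)) / 9.0               # simple smoothing kernel
out = conv2d_same(patch, kernel)
print(out.shape)                             # (8, 8): size preserved by 'same' padding
```

With 'valid' padding the same kernel would shrink the output to 6 × 6, which is why 'same' padding is needed to keep the requisite output dimensionality.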
The UNet++ architecture was used to train the CMB delensing process via a set of lensed and unlensed CMB sky maps in an end-to-end fashion. Table 1 displays the specifics of the hyperparameters used in this network. The NAdam optimizer was utilized in the analysis with the standard TensorFlow parameters [78, 79].
Table 1. Adjustment and setting of hyperparameters in the UNet++ architecture design. The optimum value is selected from the listed prior values.
H-Param | Description | Prior values | Optimum
η | Learning rate | [10^{-3}, 10^{-4}, 10^{-5}] | 10^{-4}
ω | Weight decay | [10^{-4}, 10^{-5}, 10^{-6}] | 10^{-5}
n_filters | Filters | [16, 32, 64] | 32
b | Batch size | [32, 64, 128] | 64
Ω | Optimizer | [Adam, NAdam] | NAdam
The hyperparameters were meticulously fine-tuned for network optimization. The initial number of convolutional filters and the batch size were optimized to 32 and 64, respectively, being restricted by GPU memory. The number of epochs was fixed at 3000; the learning rate and weight decay were examined over the prior values listed in table 1. The optimized values of the initial number of convolution filters, learning rate, weight decay, and batch size were fixed at 32, 10^{-4}, 10^{-5}, and 64, respectively. The total number of parameters is 7.4 × 10^7, of which 9.04 × 10^6 are trainable. Since the CMB sky map data contain both positive and negative values due to temperature fluctuations, LeakyReLU activation with alpha = 1.0, rather than plain ReLU, was chosen to handle this property. Figure 5 illustrates the evolution of the loss function during the training of the T, E, and B maps as a function of the number of epochs, shown from top to bottom. The dark blue line indicates the evolution of the training-set loss function, while the light blue line represents that of the validation set.
Figure 5. Loss function evolution per network over epochs. From top to bottom, the results of training for T, E, and B maps, respectively, are shown. The dark blue solid line indicates the training set loss function evolution, and the light blue solid line indicates the validation set loss function evolution.
In this study, we utilized a high-performance computing platform equipped with eight NVIDIA A40 GPUs, an Intel Xeon Platinum 8358 processor, and 1000 GB of memory, providing robust computational support for the training of complex models. Taking a set of parameter-generated sky maps (e.g., the T field) as an example, a single training run takes approximately 16 hours. For datasets containing all three fields (T, E, and B), after applying 30 different rotational transformations, a total of 90 training tasks were performed, accumulating to 1440 hours of total training time.

4. Results and discussion

In this section, we conduct a thorough analysis of the results generated by the QE algorithm and the UNet++ model. Specifically, we systematically explore the outcomes of these two methods from two key perspectives: sky map patches and power spectra.

4.1. Sky map analysis

Based on the network parameter configuration described in the previous section, we trained multiple datasets and performed predictions on the test set. Figure 6 shows the residuals between the predicted and true values of the UNet++ model. The first and second rows of the figure display a sky map patch from the test set, which includes lensing effects, noise, and instrumental effects, alongside its corresponding label (an unlensed sky map patch). The third and fourth rows present the delensing results produced by the UNet++ and QE algorithms, respectively. Finally, the fifth and sixth rows illustrate the differences between the delensing results of the UNet++ and QE algorithms and their respective labels.
Figure 6. Comparison of training and predicted sky map patches. The columns correspond to the three sky map patches T, E, and B, in order. From top to bottom, the first row shows the sky map patch that has been lensed and is affected by noise and instrumental effects, the second row shows the true sky patch unaffected by gravitational lensing, the third row shows the delensed prediction of the UNet++ model, the fourth row shows the delensed prediction of the QE algorithm, the fifth row shows the residual map between the UNet++ prediction and the true image, and the sixth row shows the residual map between the QE prediction and the true image. As in figure 3, the size of a sky patch is 214.86 deg2, with the center located at (l, b) = (78.75°, 0°). The unit is μK.
It can be clearly observed from the analysis of figure 6 that the delensing method we used has played a significant role in eliminating the lensing effect. To quantitatively evaluate the results, we used two image-quality evaluation metrics, namely structural similarity (SSIM) and peak signal-to-noise ratio (PSNR).
SSIM is mainly used to compare the structural information of two images [80, 81]. The closer the SSIM is to 1, the more similar the structural information of the two images. We compute SSIM with the parameters α, β, and $\gamma$ set to 1. For the sky patches of two fields ${{ \mathcal F }}_{1}$ and ${{ \mathcal F }}_{2}$, SSIM is defined as
$\begin{eqnarray}\begin{array}{rcl}{\rm{SSIM}}({{ \mathcal F }}_{1},{{ \mathcal F }}_{2}) & = & l{({{ \mathcal F }}_{1},{{ \mathcal F }}_{2})}^{\alpha }\cdot c{({{ \mathcal F }}_{1},{{ \mathcal F }}_{2})}^{\beta }\cdot s{({{ \mathcal F }}_{1},{{ \mathcal F }}_{2})}^{\gamma }\\ & = & \frac{(2{\bar{{ \mathcal F }}}_{1}{\bar{{ \mathcal F }}}_{2}+{c}_{1})[2{\rm{CoV}}({{ \mathcal F }}_{1},{{ \mathcal F }}_{2})+{c}_{2}]}{({\bar{{ \mathcal F }}}_{1}^{2}+{\bar{{ \mathcal F }}}_{2}^{2}+{c}_{1})[\sigma {({{ \mathcal F }}_{1})}^{2}+\sigma {({{ \mathcal F }}_{2})}^{2}+{c}_{2}]},\end{array}\end{eqnarray}$
where $l({{ \mathcal F }}_{1},{{ \mathcal F }}_{2})$, $c({{ \mathcal F }}_{1},{{ \mathcal F }}_{2})$, and $s({{ \mathcal F }}_{1},{{ \mathcal F }}_{2})$ represent the brightness comparison, contrast comparison, and structure comparison between ${{ \mathcal F }}_{1}$ and ${{ \mathcal F }}_{2}$, respectively. ${\rm{CoV}}({{ \mathcal F }}_{1},{{ \mathcal F }}_{2})$ represents the covariance between ${{ \mathcal F }}_{1}$ and ${{ \mathcal F }}_{2}$, while $\sigma ({{ \mathcal F }}_{1})$ and $\sigma ({{ \mathcal F }}_{2})$ represent the standard deviation of ${{ \mathcal F }}_{1}$ and ${{ \mathcal F }}_{2}$, respectively. c1 and c2 are constants to prevent the numerator or denominator from being zero.
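With α = β = γ = 1, equation (17) reduces to a single closed form, which can be sketched directly in NumPy using global statistics over the whole patch; the stabilizing constants c1 and c2 below are hypothetical small values, not the ones used in our analysis.

```python
import numpy as np

def ssim_global(f1, f2, c1=1e-8, c2=1e-8):
    """Global SSIM with alpha = beta = gamma = 1 (one window over the patch).
    c1, c2 are small stabilizing constants (illustrative values)."""
    m1, m2 = f1.mean(), f2.mean()
    v1, v2 = f1.var(), f2.var()
    cov = ((f1 - m1) * (f2 - m2)).mean()
    num = (2 * m1 * m2 + c1) * (2 * cov + c2)
    den = (m1**2 + m2**2 + c1) * (v1 + v2 + c2)
    return num / den

rng = np.random.default_rng(0)
patch = rng.normal(size=(16, 16))
print(ssim_global(patch, patch))   # equals 1 (up to rounding) for identical patches
```

Production analyses typically compute SSIM over local sliding windows and average; the global version above is the simplest faithful instance of the formula.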
According to equation (17), we list the values of SSIM in table 2. The SSIM values of the T, E, and B sky map patches delensed with the UNet++ algorithm are close to 1, meaning that the patches delensed by UNet++ are structurally very similar to the unlensed ones.
Table 2. SSIM values between predicted map patches and ground truth labels.
${{\rm{SSIM}}}_{(T,T)}$ ${{\rm{SSIM}}}_{(E,E)}$ ${{\rm{SSIM}}}_{(B,B)}$
Patch(Truth,QE) 0.7667 0.7390 0.0282
Patch(Truth,UNet++) 0.9880 0.9760 0.9544
In contrast, the results of the QE algorithm are much lower than those of the UNet++ algorithm, and the difference is especially striking for the B-mode patches. This indicates that the UNet++ algorithm outperforms the QE algorithm in removing lensing effects.
As an image-quality assessment metric, SSIM fully considers the sensitivity of the human eye to structural information and focuses on differences in image structural information, thus being able to more accurately reflect the quality of the image. However, when comprehensively evaluating image quality, we still need to pay attention to pixel-level errors. Therefore, we further calculate the PSNR value between different images, using the following specific formula:
$\begin{eqnarray}{\rm{PSNR}}({{ \mathcal F }}_{1},{{ \mathcal F }}_{2})=10\cdot {\mathrm{log}}_{10}\left[\frac{{\rm{MAX}}({{ \mathcal F }}_{1},{{ \mathcal F }}_{2})}{{\rm{MSE}}({{ \mathcal F }}_{1},{{ \mathcal F }}_{2})}\right]\,{\rm{dB}},\end{eqnarray}$
where ${\rm{MAX}}({{ \mathcal F }}_{1},{{ \mathcal F }}_{2})$ represents the maximum pixel difference between images ${{ \mathcal F }}_{1}$ and ${{ \mathcal F }}_{2}$, and ${\rm{MSE}}({{ \mathcal F }}_{1},{{ \mathcal F }}_{2})$ represents the mean squared error between ${{ \mathcal F }}_{1}$ and ${{ \mathcal F }}_{2}$. Through this calculation, we can more comprehensively evaluate the degree of distortion at the pixel level in the image, and thus provide strong support for the comprehensive evaluation of image quality.
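Equation (18) as written translates directly into NumPy; note that MAX here follows the definition given above (the maximum pixel difference between the two maps), which differs from the conventional PSNR convention based on the squared dynamic range of the image.

```python
import numpy as np

def psnr(f1, f2):
    """PSNR in dB, following equation (18): MAX is the maximum pixel
    difference between the two maps, MSE the mean squared error."""
    max_diff = np.max(np.abs(f1 - f2))
    mse = np.mean((f1 - f2) ** 2)
    return 10.0 * np.log10(max_diff / mse)

f1 = np.array([0.0, 1.0])
f2 = np.array([0.0, 0.0])
print(psnr(f1, f2))    # 10*log10(1/0.5) ≈ 3.01 dB
```

Identical maps give MSE = 0 and hence a divergent (infinite) PSNR, so in practice the metric is only evaluated on maps that differ.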
PSNR stands as a simple yet widely utilized metric for assessing image quality, employing the decibel (dB) as its unit of measurement [82, 83]. Within this evaluative framework, a higher numerical value signifies a lower degree of image distortion, thereby indicating superior image quality. Akin to the computation of the SSIM metric, we have also conducted a corresponding analytical calculation of the PSNR, as shown in table 3.
Table 3. PSNR values between predicted map patches and ground truth labels. The unit is dB.
PSNR(T,T) PSNR(E,E) PSNR(B,B)
Patch(Truth,QE) 21.14 19.25 14.48
Patch(Truth,UNet++) 37.70 38.76 37.87
By analyzing the data in table 3, we find that the PSNR values obtained by the UNet++ algorithm are also significantly higher than those of the QE algorithm. This shows that the UNet++ algorithm performs better in signal recovery and restores the true signal more effectively.

4.2. Angular power spectrum analysis

In the data simulation stage, in order to study the impact of the lensing effect on the CMB, we conducted 30 simulations for each component. Since the segmentation of the sky map may introduce errors in scale information, we performed a specific angle rotation on each simulated sky map, followed by the delensing operation, and finally rotated the result back to the original angle. We then averaged the results to obtain a complete sky map for the subsequent angular power spectrum calculations.
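The rotate-delens-unrotate-average bookkeeping can be outlined schematically; here a 1D circular shift stands in for the spherical rotation, and the delensing step is an arbitrary placeholder callable, so this only illustrates the averaging logic, not the HEALPix rotations actually used.

```python
import numpy as np

def rotation_averaged_delens(sky, delens, shifts):
    """Average delensed maps over several rotations.

    sky    : 1D array standing in for a (flattened) sky map
    delens : callable applied to each rotated map (placeholder for the network)
    shifts : list of rotation amounts (here: circular shifts)
    """
    results = []
    for s in shifts:
        rotated = np.roll(sky, s)             # rotate
        cleaned = delens(rotated)             # delens in the rotated frame
        results.append(np.roll(cleaned, -s))  # rotate back
    return np.mean(results, axis=0)           # average over rotations

sky = np.arange(8, dtype=float)
# With an identity "delens" step, the rotation average returns the input exactly
recovered = rotation_averaged_delens(sky, lambda m: m, shifts=range(4))
```

Averaging over rotations suppresses the patch-boundary artifacts discussed below, since each rotation places the segmentation seams at different positions on the sky.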
To present the impact of the rotation processing more intuitively, we first show the full-sky power spectra of each component without rotation processing, as shown in figure 7. The figure presents the TT, EE, and BB power spectra in red, green, and blue tones, respectively. For each spectrum, the solid line denotes the lensed power spectrum, the dashed line the true unlensed spectrum, the dash-dot line the spectrum delensed by the UNet++ network, and the dotted line the spectrum delensed by the QE algorithm.
Figure 7. For the TT spectrum, the dark red solid line denotes the lensed power spectrum, the red dashed line shows the unlensed spectrum, the salmon dash-dot line corresponds to the spectrum delensed using UNet++, and the pink dotted line represents delensing via the QE algorithm. For the EE spectrum, the dark green solid line indicates the lensed EE power spectrum, the green dashed line shows the unlensed case, the light green dash-dot line represents delensing with UNet++, and the lime dotted line shows the delensed spectrum obtained by the QE method. For the BB spectrum, the dark blue solid line denotes the lensed BB power spectrum, the blue dashed line represents the unlensed primordial signal, the light blue dash-dot line corresponds to delensing by UNet++, and the cyan dotted line shows the result from QE-based delensing. The parameters for the temperature power spectrum are set to As = 2.1 × 10-9 and r = 0.005.
As can be seen from figure 7, there are significant differences between the delensing effects of the UNet++ and QE algorithms at high $\ell$ values (small scales) and the label power spectrum. This phenomenon reveals the varying performances of the QE and UNet++ network across different scales. Specifically, as the scale decreases, the delensing effect of UNet++ diminishes gradually, and the rate of this decrease accelerates. We propose two possible explanations for this finding.
Firstly, we believe that the lensing effect may gradually increase at smaller scales, causing a significant distortion of the original CMB signal. This distortion may exceed the correction capability of the UNet++ network, leading to a decrease in delensing effectiveness on even smaller scales. Secondly, after dividing the full-sky map into small patches and performing delensing processing on them separately, the algorithms can reconstruct the central regions of each patch relatively accurately, but often fail to fully recover or show slight misalignments at the edges. After stitching the patches together, the tiny discontinuities at the boundaries of each patch correspond to the high $\ell$ components in the angular power spectrum. In contrast, low $\ell$ modes span multiple small patches, and the algorithms tend to retain the average values of each patch. Therefore, even if there are subtle offsets in the overall large-scale structure after stitching, it can still maintain good continuity. To reduce the impact of boundary artifacts on high $\ell$ signals, we attempt to perform rotation processing on the small patches.
It is worth noting that the two algorithms also exhibit opposite deviations in the high $\ell$ range. This is because UNet++, in order to suppress noise amplification, applies stronger smoothing or regularization to strong signals, resulting in a power spectrum recovered at high $\ell$ values that is lower than the label. On the other hand, the QE algorithm is more inclined to preserve all the details, leading to a power spectrum at high $\ell$ values that is higher than the label.
To optimize the delensing results, as previously stated, we first applied 30 rotational transformations to the acquired full-sky maps. We then mitigated the lensing effects present in the rotated data, and finally rotated the processed data back to the original orientation, yielding the results depicted in figure 8. The results of the QE delensing algorithm are obtained directly from the full-sky map, so no additional rotation and segmentation operations are required in that case. As evident from the figure, the errors stemming from image segmentation have been substantially reduced by the rotational processing.
Figure 8. Angular power spectrum of the unlensed and predicted CMB TT, EE, and BB. The results of UNet++ are obtained by averaging over 30 rotational transformations, while the power spectra of the QE algorithm are calculated based on the full-sky map. The line color and line style in this diagram are consistent with figure 7.
In order to compare the delensing effects on various scales more intuitively, we introduce a quantitative indicator, defined as follows:
$\begin{eqnarray}{{\rm{\Gamma }}}_{{\rm{\Lambda }}}^{XX}=\mathrm{Mean}\left(\left|\displaystyle \frac{{C}_{{\ell }}^{X{X}_{\mathrm{unlensed}}}}{{C}_{{\ell }}^{X{X}_{\mathrm{delensed}}}}-1\right|\right),\end{eqnarray}$
where the subscript $\Lambda$ denotes the QE or the UNet++ algorithm, the superscript XX ∈ {TT, EE, BB}, and the mean is taken over $\ell$. The closer the value of $\Gamma$ is to zero, the better the delensing effect; conversely, the greater the deviation of $\Gamma$ from zero, the less satisfactory the delensing effect.
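Equation (19) translates directly into a few lines of NumPy; the toy spectra below are illustrative, not the measured ones.

```python
import numpy as np

def gamma_metric(cl_unlensed, cl_delensed):
    """Gamma = mean over ell of |C_ell^unlensed / C_ell^delensed - 1|.
    Zero means perfect delensing; larger values mean larger residuals."""
    return np.mean(np.abs(cl_unlensed / cl_delensed - 1.0))

# Toy spectra: a delensed spectrum 10% high at every ell gives Gamma ~ 0.09
cl_true = np.array([100.0, 50.0, 25.0])
cl_del  = 1.1 * cl_true
print(gamma_metric(cl_true, cl_del))
```

Because the ratio is averaged in absolute value, over- and under-estimation at different multipoles cannot cancel, making the metric a conservative summary of the residual lensing power.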
Based on the quantitative evaluation metric specified in equation (19), for the TT spectrum, the error value of the QE algorithm is ${{\rm{\Gamma }}}_{\mathrm{QE}}^{TT}=0.6451$, while that of the UNet++ algorithm is ${{\rm{\Gamma }}}_{{\mathrm{UNet}}++}^{TT}=0.0582$; the error of the QE algorithm is thus approximately 11 times that of the UNet++ algorithm. For the EE spectrum, the error of the QE algorithm is ${{\rm{\Gamma }}}_{\mathrm{QE}}^{EE}=0.4560$, and that of the UNet++ algorithm is ${{\rm{\Gamma }}}_{{\mathrm{UNet}}++}^{EE}=0.0362$, a factor of roughly 13. In the case of the BB spectrum, the disparity between the two algorithms is even more pronounced: the error of the QE algorithm is ${{\rm{\Gamma }}}_{\mathrm{QE}}^{BB}=0.9224$, while that of the UNet++ algorithm is ${{\rm{\Gamma }}}_{{\mathrm{UNet}}++}^{BB}=0.0693$, again a factor of roughly 13. Based on these comparisons, it is evident that UNet++ achieves a significant improvement over the QE algorithm.

4.3. Discussion

The results and analysis presented in the previous section demonstrate that the UNet framework achieves remarkably high performance in delensing. This outcome arises from the deliberate design of our proof-of-concept experiment: the network was trained to map noisy, lensed inputs onto idealized B-mode maps that are both noiseless and unlensed. Consequently, the model is encouraged not only to remove distortions induced by gravitational lensing but also to suppress noise, leading to results that appear exceptionally strong.
A further concern is whether such performance merely reflects memorization of the training labels rather than the learning of physically meaningful patterns. To address this, we enforced strict separation between training, validation, and test datasets, and ensured that the cosmological parameters used to generate the test maps differed from those employed in training. The results indicate that the network performs well when the test parameters lie close to the training distribution, whereas its performance degrades for out-of-distribution cases. This distribution-dependent behavior suggests that the model acts analogously to an interpolation scheme: it achieves strong results within the range of the training data but exhibits limited extrapolation capability. Crucially, this behavior confirms that the network has not simply memorized the training labels, but has instead learned nontrivial mappings associated with the gravitational lensing process, effectively ruling out overfitting.
It should be emphasized that our current experiments are entirely based on simulated data. In the process of simulation, certain complexities inevitably present in real observations are omitted, such as non-Gaussian noise, residual foreground contamination, instrumental systematics, and uncertainties related to cosmological models. Therefore, we do not rule out the possibility that the model's performance may degrade when applied to real observational data. Nevertheless, delensing of the CMB remains a frontier research area. In the absence of large-scale, high-quality observational data, exploring and validating new methods using simulated data is an essential step toward advancing this field.
Looking forward, with the upcoming high-quality observational data from next-generation experiments such as LiteBIRD, CMB-S4, AliCPT, and PIPER, it will become possible to validate and refine these deep learning approaches under conditions closer to real observations. Furthermore, by incorporating more accurate noise modeling, complex foreground components, and instrumental systematics, future studies are expected to systematically evaluate and potentially extend the applicability and robustness of the methods proposed in this work.

5. Conclusions

In CMB observations, weak gravitational lensing distorts CMB images, causing mixing between the T-mode and E-mode, as well as between the E-mode and B-mode. Therefore, finding effective methods to remove the effects of weak gravitational lensing on CMB data is one of the key challenges in revealing the true B-mode polarization signal.
We have adopted a relatively novel solution that allows spherical data to be processed using deep learning methods.
Our approach presents several advantages. It can be applied regardless of the resolution of the full-sky map and requires no complex initialization procedures. The data can be used for model training after simple preprocessing of the sky maps. Segmentation errors are inevitably present in sky maps irrespective of the data type, but our model retains the original data resolution as much as possible through multiple rotation operations.
During the delensing study of full-sky images of CMB T-mode, polarization E-mode, and B-mode components, the UNet++ network demonstrated outstanding performance. We analyzed it at both the image and angular power spectrum levels.
At the image level, we employed two image-quality assessment metrics, SSIM and PSNR. For the map patches of the T/E/B modes, the UNet++ algorithm achieved SSIM values very close to 1 and PSNR values above 30 dB, substantially outperforming the QE algorithm.
At the power spectrum level, we analyzed the characteristic changes in the TT, EE, and BB spectra. In terms of the corresponding quantitative indicator, ${{\rm{\Gamma }}}_{{\mathrm{UNet}}++}^{TT}$ was only 0.0582, significantly lower than ${{\rm{\Gamma }}}_{\mathrm{QE}}^{TT}=0.6451$, which fully demonstrates the effectiveness of the delensing method. Similarly, there was a marked improvement in the EE spectrum: ${{\rm{\Gamma }}}_{{\mathrm{UNet}}++}^{EE}=0.0362$, while ${{\rm{\Gamma }}}_{\mathrm{QE}}^{EE}=0.4560$. The difference between the lensed and unlensed BB spectra was particularly pronounced; hence, the delensing effect there was even more prominent. Specifically, ${{\rm{\Gamma }}}_{{\mathrm{UNet}}++}^{BB}=0.0693$, whereas ${{\rm{\Gamma }}}_{\mathrm{QE}}^{BB}=0.9224$ approached 1, further confirming the superior performance of the UNet++ algorithm in this mode. Moreover, by introducing the rotation mechanism, the angular power spectrum improved significantly compared to the non-rotated scenario. However, there is still room for optimizing the delensing effect on small-scale structures.
Furthermore, weak cosmological signals, especially B-mode polarization signals, are particularly susceptible to contamination by foreground radiation. Therefore, in future work, we will consider the impact of foreground components to achieve effective subtraction of CMB foregrounds.

We thank Sebastian Belkner for his valuable suggestions and guidance on the correct use of Lenspyx. This work was supported by the National SKA Program of China (Grants Nos. 2022SKA0110200 and 2022SKA0110203), the National Natural Science Foundation of China (Grants Nos. 12533001, 12575049, 12473001, 11975072, 11835009, and 11875102), the Liaoning Revitalization Talents Program (Grant No. XLYC1905011), the National 111 Project of China (Grant No. B16009), and Science Research Grants from the China Manned Space Project (Grant No. CMS-CSST-2025-A02).

[1] Smoot G F 1992 Structure in the COBE differential microwave radiometer first year maps Astrophys. J. Lett. 396 L1-L5
[2] Fixsen D J, Cheng E S, Gales J M, Mather J C, Shafer R A, Wright E L 1996 The cosmic microwave background spectrum from the full COBE FIRAS data set Astrophys. J. 473 576
[3] Bennett C L, Banday A, Gorski K M, Hinshaw G, Jackson P, Keegstra P, Kogut A, Smoot G F, Wilkinson D T, Wright E L 1996 Four-year COBE DMR cosmic microwave background observations: maps and basic results Astrophys. J. Lett. 464 L1-L4
[4] Spergel D N 2003 First-year Wilkinson Microwave Anisotropy Probe (WMAP) observations: determination of cosmological parameters Astrophys. J. Suppl. 148 175-194
[5] Komatsu E 2009 Five-year Wilkinson Microwave Anisotropy Probe (WMAP) observations: cosmological interpretation Astrophys. J. Suppl. 180 330-376
[6] Komatsu E 2009 Five-year Wilkinson Microwave Anisotropy Probe (WMAP) observations: cosmological interpretation Astrophys. J. Suppl. 180 330-376
[7] Ade P A R 2014 Planck 2013 results: XVI. Cosmological parameters Astron. Astrophys. 571 A16
[8] Ade P A R 2016 Planck 2015 results: XIII. Cosmological parameters Astron. Astrophys. 594 A13
[9] Aghanim N 2020 Planck 2018 results: VI. Cosmological parameters Astron. Astrophys. 641 A6
[10] Leitch E M, Kovac J M, Pryke C, Reddall B, Sandberg E S, Dragovan M, Carlstrom J E, Halverson N W, Holzapfel W L 2002 Measuring polarization with DASI Nature 420 763-771
[11] Readhead A C S 2004 Polarization observations with the cosmic background imager Science 306 836
[12] Carlstrom J E 2011 The 10 meter South Pole Telescope Publ. Astron. Soc. Pac. 123 568-581
[13] Barkats D 2014 Degree-scale CMB polarization measurements from three years of BICEP1 data Astrophys. J. 783 67
[14] Ade P A R 2014 BICEP2 II: experiment and three-year data set Astrophys. J. 792 62
[15] Staniszewski Z 2012 The Keck Array: a multi camera CMB polarimeter at the South Pole J. Low Temp. Phys. 167 827-833
[16] Coulton W 2024 Atacama Cosmology Telescope: high-resolution component-separated maps across one third of the sky Phys. Rev. D 109 063530
[17] Li H 2019 Probing primordial gravitational waves: Ali CMB polarization telescope Natl. Sci. Rev. 6 145-154
[18] Masi S 2002 The BOOMERanG experiment and the curvature of the universe Prog. Part. Nucl. Phys. 48 243-261
[19] Aboobaker A M 2018 The EBEX balloon-borne experiment: optics, receiver, and polarimetry Astrophys. J. Suppl. 239 7
[20] Crill B P 2008 SPIDER: a balloon-borne large-scale CMB polarimeter Proc. SPIE Int. Soc. Opt. Eng. 7010 70102P
[21] Allys E 2023 Probing cosmic inflation with the LiteBIRD cosmic microwave background polarization survey PTEP 2023 042F01
[22] Abazajian K N 2016 CMB-S4 science book arXiv:1610.02743
[23] Abazajian K 2022 Snowmass 2021 CMB-S4 white paper arXiv:2203.08024
[24] Lazear J 2014 The primordial inflation polarization explorer (PIPER) Proc. SPIE Int. Soc. Opt. Eng. 9153 91531L
[25] Lyth D H, Riotto A 1999 Particle physics models of inflation and the cosmological density perturbation Phys. Rept. 314 1-146
[26] Baumann D 2011 Inflation Theoretical Advanced Study Institute in Elementary Particle Physics: Physics of the Large and the Small 523-686
[27] Ade P A R 2014 Detection of B-mode polarization at degree angular scales by BICEP2 Phys. Rev. Lett. 112 241101
[28] Ade P 2019 The Simons Observatory: science goals and forecasts J. Cosmol. Astropart. Phys. JCAP02(2019)056
[29] Lewis A, Challinor A 2006 Weak gravitational lensing of the CMB Phys. Rept. 429 1-65
[30] Okamoto T, Hu W 2003 CMB lensing reconstruction on the full sky Phys. Rev. D 67 083002
[31] Bartelmann M, Schneider P 2001 Weak gravitational lensing Phys. Rept. 340 291-472
[32] Hu W 2000 Weak lensing of the CMB: a harmonic approach Phys. Rev. D 62 043007
[33] Zaldarriaga M, Seljak U 1998 Gravitational lensing effect on cosmic microwave background polarization Phys. Rev. D 58 023003
[34] Dodelson S, Rozo E, Stebbins A 2003 Primordial gravity waves and weak lensing Phys. Rev. Lett. 91 021301
[35] Lewis A, Challinor A, Lasenby A 2000 Efficient computation of CMB anisotropies in closed FRW models Astrophys. J. 538 473-476
[36] Lewis A 2013 Efficient sampling of fast and slow cosmological parameters Phys. Rev. D 87 103529
[37] Lewis A 2005 Lensed CMB simulation and parameter estimation Phys. Rev. D 71 083008
[38] Gupta A, Zorrilla Matilla J M, Hsu D, Haiman Z 2018 Non-Gaussian information from weak lensing data via deep learning Phys. Rev. D 97 103515
[39] Caldeira J, Wu W L K, Nord B, Avestruz C, Trivedi S, Story K T 2019 DeepCMB: lensing reconstruction of the cosmic microwave background with deep neural networks Astron. Comput. 28 100307
[40] Zhou Z, Rahman Siddiquee M M, Tajbakhsh N, Liang J 2018 UNet++: a nested U-Net architecture for medical image segmentation Deep Learning in Medical Image Analysis and Multimodal Learning for Clinical Decision Support: 4th International Workshop, DLMIA 2018, and 8th International Workshop, ML-CDS 2018 (MICCAI 2018, Granada, Spain) 11045 3-11
[41] Yip J H T, Zhang X, Wang Y, Zhang W, Sun Y, Contardo G, Villaescusa-Navarro F, He S, Genel S, Ho S 2019 From dark matter to galaxies with convolutional neural networks 33rd Annual Conference on Neural Information Processing Systems
[42] Zhou Z, Siddiquee M M R, Tajbakhsh N, Liang J 2020 UNet++: redesigning skip connections to exploit multiscale features in image segmentation IEEE Trans. Med. Imaging 39 1856-1867
[43] Springer O M, Ofek E O, Weiss Y, Merten J 2020 Weak lensing shear estimation beyond the shape-noise limit: a machine learning approach Mon. Not. Roy. Astron. Soc. 491 5301-5316
[44] Makinen T L, Lancaster L, Villaescusa-Navarro F, Melchior P, Ho S, Perreault-Levasseur L, Spergel D N 2021 deep21: a deep learning method for 21 cm foreground removal J. Cosmol. Astropart. Phys. JCAP04(2021)081
[45] Guzman E, Meyers J 2022 Reconstructing cosmic polarization rotation with ResUNet-CMB J. Cosmol. Astropart. Phys. JCAP01(2022)030
[46] Ni S, Li Y, Gao L-Y, Zhang X 2022 Eliminating primary beam effect in foreground subtraction of neutral hydrogen intensity mapping survey with deep learning Astrophys. J. 934 83
[47] Gao L-Y, Li Y, Ni S, Zhang X 2023 Eliminating polarization leakage effect for neutral hydrogen intensity mapping with deep learning Mon. Not. Roy. Astron. Soc. 525 5278-5290
[48] Yan Y-P, Li S-Y, Wang G-J, Zhang Z, Xia J-Q 2024 CMBFSCNN: cosmic microwave background polarization foreground subtraction with a convolutional neural network Astrophys. J. Suppl. 274 4
[49] Wang G 2017 Interactive medical image segmentation using deep learning with image-specific fine tuning IEEE Trans. Med. Imaging 37 1562-1573
[50] Ghosh S, Das N, Das I, Maulik U 2019 Understanding deep learning techniques for image segmentation ACM Comput. Surv. 52 1-35
[51] Minaee S, Boykov Y, Porikli F M, Plaza A J, Kehtarnavaz N, Terzopoulos D 2020 Image segmentation using deep learning: a survey IEEE Trans. Pattern Anal. Mach. Intell. 44 3523-3542
[52] Diego-Palazuelos P, Vielva P, Martínez-González E, Barreiro R B 2020 Comparison of delensing methodologies and assessment of the delensing capabilities of future experiments J. Cosmol. Astropart. Phys. JCAP11(2020)058
[53] Górski K M, Hivon E, Banday A J, Wandelt B D, Hansen F K, Reinecke M, Bartelman M 2005 HEALPix: a framework for high resolution discretization and fast analysis of data distributed on the sphere Astrophys. J. 622 759-771
[54] Zonca A, Singer L, Lenz D, Reinecke M, Rosset C, Hivon E, Gorski K 2019 healpy: equal area pixelization and spherical harmonics transforms for data on the sphere in Python J. Open Source Softw. 4 1298
[55] Lucca M, Schöneberg N, Hooper D C, Lesgourgues J, Chluba J 2020 The synergy between CMB spectral distortions and anisotropies J. Cosmol. Astropart. Phys. JCAP02(2020)026
[56] Di Dio E, Montanari F, Lesgourgues J, Durrer R 2013 The CLASSgal code for relativistic cosmological large scale structure J. Cosmol. Astropart. Phys. JCAP11(2013)044
[57] Hassani F, Baghram S, Firouzjahi H 2016 Lensing as a probe of early universe: from CMB to galaxies J. Cosmol. Astropart. Phys. JCAP05(2016)044
[58] Zaldarriaga M, Seljak U 1997 All-sky analysis of polarization in the microwave background Phys. Rev. D 55 1830-1840
[59] Zaldarriaga M 1998 Cosmic microwave background polarization experiments Astrophys. J. 503 1
[60] Rotti A, Huffenberger K 2019 Real-space computation of E/B-mode maps. Part I. Formalism, compact kernels, and polarized filaments J. Cosmol. Astropart. Phys. JCAP01(2019)045
[61] Kamionkowski M, Kosowsky A, Stebbins A 1997 Statistics of cosmic microwave background polarization Phys. Rev. D 55 7368-7388
[62] Kamionkowski M, Kosowsky A, Stebbins A 1997 A probe of primordial gravity waves and vorticity Phys. Rev. Lett. 78 2058-2061
[63] Bunn E F, Zaldarriaga M, Tegmark M, de Oliveira-Costa A 2003 E/B decomposition of finite pixelized CMB maps Phys. Rev. D 67 023501
[64] Kamionkowski M, Kovetz E D 2016 The quest for B modes from inflationary gravitational waves Ann. Rev. Astron. Astrophys. 54 227-269
[65] Kim J, Naselsky P 2010 E/B decomposition of CMB polarization pattern of incomplete sky: a pixel space approach Astron. Astrophys. 519 A104
[66] Wu W L K, Errard J, Dvorkin C, Kuo C L, Lee A T, McDonald P, Slosar A, Zahn O 2014 A guide to designing future ground-based cosmic microwave background experiments Astrophys. J. 788 138
[67] Ade P A R 2015 Joint analysis of BICEP2/Keck Array and Planck data Phys. Rev. Lett. 114 101301
[68] Wu D, Li H, Ni S, Li Z-W, Liu C-Z 2020 Detecting primordial gravitational waves: a forecast study on optimizing frequency distribution of next generation ground-based CMB telescope Eur. Phys. J. C 80 139
[69] Wolz K 2024 The Simons Observatory: pipeline comparison and validation for large-scale B-modes Astron. Astrophys. 686 A16
[70] Hanany S 2019 PICO: Probe of Inflation and Cosmic Origins arXiv:1902.10541
[71] Perraudin N, Defferrard M, Kacprzak T, Sgier R 2019 DeepSphere: efficient spherical convolutional neural network with HEALPix sampling for cosmological applications Astron. Comput. 27 130-146
[72] Krachmalnicoff N, Tomasi M 2019 Convolutional neural networks on the HEALPix sphere: a pixel-based algorithm and its application to CMB data analysis Astron. Astrophys. 628 A129
[73] Schaan E, Ferraro S 2019 Foreground-immune cosmic microwave background lensing with shear-only reconstruction Phys. Rev. Lett. 122 181301
[74] Belkner S, Carron J, Legrand L, Umiltà C, Pryke C, Bischoff C 2024 CMB-S4: iterative internal delensing and r constraints Astrophys. J. 964 148
[75] Hu W, Okamoto T 2002 Mass reconstruction with CMB polarization Astrophys. J. 574 566-574
[76] Böhm V, Schmittfull M, Sherwin B D 2016 Bias to CMB lensing measurements from the bispectrum of large-scale structure Phys. Rev. D 94 043519
[77] Ronneberger O, Fischer P, Brox T 2015 U-Net: convolutional networks for biomedical image segmentation arXiv:1505.04597
[78] Ruder S 2016 An overview of gradient descent optimization algorithms arXiv:1609.04747
[79] Reddi S J, Kale S, Kumar S 2018 On the convergence of Adam and beyond arXiv:1904.09237
[80] Wang Z, Bovik A C, Sheikh H R, Simoncelli E P 2004 Image quality assessment: from error visibility to structural similarity IEEE Trans. Image Process. 13 600-612
[81] Wang Z, Bovik A C 2009 Mean squared error: love it or leave it? A new look at signal fidelity measures IEEE Signal Process. Mag. 26 98-117
[82] Jähne B 2005 Digital Image Processing Springer Science & Business Media
[83] Gibson J D, Bovik A 2000 Handbook of Image and Video Processing 1st edn Academic Press