<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Paper-Conference | Hao Yan</title><link>https://hyan46.github.io/publication-type/paper-conference/</link><atom:link href="https://hyan46.github.io/publication-type/paper-conference/index.xml" rel="self" type="application/rss+xml"/><description>Paper-Conference</description><generator>Wowchemy (https://wowchemy.com)</generator><language>en-US</language><copyright>© 2026 Hao Yan</copyright><lastBuildDate>Thu, 01 Jan 2026 00:00:00 +0000</lastBuildDate><image><url>https://hyan46.github.io/media/icon_hudffdcafa99c609c7e4dfde01dba38f93_35970_512x512_fill_lanczos_center_3.png</url><title>Paper-Conference</title><link>https://hyan46.github.io/publication-type/paper-conference/</link></image><item><title>D-Convexity: A Unified Differentiable Convex Shape Prior via Quasi-Concavity for Data-driven Image Segmentation</title><link>https://hyan46.github.io/chen-dconvexity-cvpr-2026/</link><pubDate>Thu, 01 Jan 2026 00:00:00 +0000</pubDate><guid>https://hyan46.github.io/chen-dconvexity-cvpr-2026/</guid><description>&lt;h2 id="overview">Overview&lt;/h2>
&lt;p>&lt;strong>D-Convexity&lt;/strong> is a unified, &lt;strong>threshold-free&lt;/strong>, &lt;strong>fully differentiable&lt;/strong> convex-shape prior
for data-driven image segmentation. Instead of constraining the binary mask at a fixed
threshold, we require the &lt;em>entire&lt;/em> network output $u:\Omega\to[0,1]$ to be
&lt;strong>quasi-concave&lt;/strong> — equivalently, &lt;em>every&lt;/em> super-level set
$S_\gamma=\{\mathbf{x}\in\Omega \mid u(\mathbf{x})\geq\gamma\}$
is convex. From this single principle we derive &lt;strong>zero-, first-, and second-order&lt;/strong>
characterizations that turn a hard global geometric constraint into local, differentiable
inequalities, yielding a compact convolutional loss and a drop-in &lt;strong>Convex Gradient
Projection Module (CGPM)&lt;/strong>.&lt;/p>
&lt;p>Accepted at &lt;strong>&lt;a href="https://cvpr.thecvf.com/virtual/2026/poster/39174" target="_blank" rel="noopener">CVPR 2026&lt;/a>&lt;/strong> as a &lt;strong>Highlight paper&lt;/strong> (top 3%).&lt;/p>
&lt;figure >
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" style="width: 100%; ">&lt;img alt="D-Convexity architecture: Swin Transformer backbone produces a feature map o, which is passed through a sigmoid to give a raw mask u. The Convex Gradient Projection Module (CGPM) then iteratively projects u onto the quasi-concave manifold using the convex loss gradient, yielding a strictly convex final mask. Training uses cross-entropy on the raw mask and the quasi-concavity loss on the projected mask." srcset="
/chen-dconvexity-cvpr-2026/figures/architecture_hue70167b8aaf56e0966ff3e25d321b857_391144_939aa30de7679cb3f1aee5d71d975f80.webp 400w,
/chen-dconvexity-cvpr-2026/figures/architecture_hue70167b8aaf56e0966ff3e25d321b857_391144_f9a4135ee394c155544ba0e1c5854fa6.webp 760w,
/chen-dconvexity-cvpr-2026/figures/architecture_hue70167b8aaf56e0966ff3e25d321b857_391144_1200x1200_fit_q75_h2_lanczos_3.webp 1200w"
src="https://hyan46.github.io/chen-dconvexity-cvpr-2026/figures/architecture_hue70167b8aaf56e0966ff3e25d321b857_391144_939aa30de7679cb3f1aee5d71d975f80.webp"
loading="lazy"
style="width: 100%; height: auto; display: block;" />&lt;/div>
&lt;/div>&lt;/figure>
&lt;p class="has-text-centered" style="max-width:900px;margin:0.5rem auto 1.5rem;font-size:0.95rem;color:#444;">&lt;span class="figure-number">Figure 1:&lt;/span> Overall framework. A Swin-Transformer encoder–decoder backbone produces a feature map $o$; a sigmoid yields the raw mask $u=\mathcal{S}(o)$. The &lt;strong>Convex Gradient Projection Module (CGPM)&lt;/strong> is an unrolled gradient-descent block ($v^0 \rightarrow v^1 \rightarrow \cdots \rightarrow v^T$) that projects $u$ onto the quasi-concave manifold by descending along the gradient $\nabla\mathcal{L}_{\mathrm{convex}}$ of the convexity loss. The network is trained with cross-entropy $\mathcal{L}_{\mathrm{CE}}$ on the raw mask and the quasi-concavity loss $\mathcal{L}_{\mathrm{convex}}$ on the projected mask.&lt;/p>
&lt;hr>
&lt;h2 id="animation">Animated Demo: Zero/First/Second-Order Convexification&lt;/h2>
&lt;p>The animation below visualizes the &lt;strong>midpoint (zero-order)&lt;/strong>, &lt;strong>first-order gradient&lt;/strong>, and
&lt;strong>second-order Hessian&lt;/strong> convexification dynamics applied to a non-convex initial mask.
All three orders progressively regularize the shape into a convex region, but with
increasing levels of spatial smoothness.&lt;/p>
&lt;figure class="video-figure" style="margin: 1.5rem auto; text-align: center;">
&lt;div style="display: flex; justify-content: center;">
&lt;video
style="width: 100%; max-width: 980px; height: auto; display: block; border-radius: 6px;"
autoplay
loop
muted
controls
playsinline
preload="metadata"
>
&lt;source src="https://hyan46.github.io/chen-dconvexity-cvpr-2026/figures/combined_all_orders.mp4" type="video/mp4">
Your browser does not support the video tag.
&lt;/video>
&lt;/div>&lt;figcaption style="margin-top: 0.75rem; color: #555; font-size: 0.95rem;">
Convexification dynamics under the proposed zero-, first-, and second-order quasi-concavity priors. Starting from non-convex inputs, the mask function $u$ is iteratively updated by (left) the local midpoint rule (Algorithm 1, zero-order), (middle) the first-order gradient-based supporting-hyperplane condition, and (right) the second-order quadratic-form penalty $Q_2(\mathbf{x})$. Higher-order priors produce progressively smoother convex shapes.
&lt;/figcaption>&lt;/figure>
&lt;hr>
&lt;h2 id="motivation">Motivation&lt;/h2>
&lt;p>Convexity is a fundamental prior: many anatomical structures (optic disc/cup, blood
vessels, organs) and man-made objects are convex or close-to-convex. Enforcing convexity
suppresses holes, fragmented predictions, and irregular boundary artifacts, especially
under &lt;strong>noise, occlusion, and limited training data&lt;/strong>.&lt;/p>
&lt;p>Existing approaches, however, have significant limitations:&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Discrete formulations&lt;/strong> (e.g. 1–0–1 collinear-triplet penalties, graph-cuts with
convexity constraints, ILP/multicut decompositions) rely on combinatorial solvers and
are &lt;strong>hard to differentiate&lt;/strong> through.&lt;/li>
&lt;li>&lt;strong>Level-set/curvature methods&lt;/strong> (non-negative curvature $\kappa\geq 0$,
signed-distance Laplacian $\Delta\phi\geq 0$) certify convexity only at &lt;em>one&lt;/em> chosen
threshold (e.g. $\phi=0$) and are typically &lt;em>necessary but not sufficient&lt;/em>.&lt;/li>
&lt;li>&lt;strong>Recent deep shape priors&lt;/strong> still lack explicit, principled control over convexity
at every confidence level.&lt;/li>
&lt;/ul>
&lt;p>&lt;strong>D-Convexity&lt;/strong> resolves all three issues with a single functional view: the mask
function $u$ itself should be quasi-concave.&lt;/p>
&lt;hr>
&lt;h2 id="theory">Theory: Quasi-Concavity as a Unified Convex Prior&lt;/h2>
&lt;p>We formalize convexity threshold-freely as quasi-concavity of $u$:&lt;/p>
$$
u \text{ is quasi-concave} \;\Longleftrightarrow\; \forall \gamma,\; S_\gamma=\{\mathbf{x}\mid u(\mathbf{x})\geq\gamma\}\ \text{is convex}.
$$
&lt;figure >
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" style="width: 100%; ">&lt;img alt="Left: a concave function lies below its tangent plane everywhere. Right: a quasi-concave function may be steeper than any tangent plane, but every horizontal slice (super-level set) is still a convex region. The gradient at a level-set point x defines the supporting hyperplane (y-x) perpendicular to grad u." srcset="
/chen-dconvexity-cvpr-2026/figures/quasi_concave_hue70167b8aaf56e0966ff3e25d321b857_561392_fce3deffeb3ea82e7cc971b3c405a46e.webp 400w,
/chen-dconvexity-cvpr-2026/figures/quasi_concave_hue70167b8aaf56e0966ff3e25d321b857_561392_f2c2df4170f741ed7a30be6dce060c3f.webp 760w,
/chen-dconvexity-cvpr-2026/figures/quasi_concave_hue70167b8aaf56e0966ff3e25d321b857_561392_1200x1200_fit_q75_h2_lanczos_3.webp 1200w"
src="https://hyan46.github.io/chen-dconvexity-cvpr-2026/figures/quasi_concave_hue70167b8aaf56e0966ff3e25d321b857_561392_fce3deffeb3ea82e7cc971b3c405a46e.webp"
loading="lazy"
style="width: 80%; height: auto; display: block;" />&lt;/div>
&lt;/div>&lt;/figure>
&lt;p class="has-text-centered" style="max-width:900px;margin:0.5rem auto 1.5rem;font-size:0.95rem;color:#444;">&lt;span class="figure-number">Figure 2:&lt;/span> &lt;strong>Concave vs. quasi-concave functions.&lt;/strong> A concave function (left) lies below every tangent plane — a &lt;em>strong&lt;/em> property that most segmentation masks violate. A &lt;strong>quasi-concave&lt;/strong> function (right) is the weaker, &lt;em>threshold-free&lt;/em> notion D-Convexity uses: it only requires that every super-level set $S_\gamma$ be a convex region. At any boundary point $\mathbf{x}$, the supporting hyperplane is given by $\nabla u(\mathbf{x})^{\top}(\mathbf{y}-\mathbf{x})=0$ — this is the geometric content of our &lt;strong>first-order condition&lt;/strong>.&lt;/p>
&lt;p>By considering different smoothness assumptions on $u$, we derive three equivalent (or
sufficient) characterizations:&lt;/p>
&lt;h3 id="zero-order">Zero-order condition ($u\in C^0$)&lt;/h3>
&lt;blockquote>
&lt;p>$u$ is quasi-concave $\Longleftrightarrow$ for all $\mathbf{x},\mathbf{y}\in\Omega,\ \lambda\in[0,1]$:
&lt;/p>
$$u(\lambda\mathbf{x}+(1-\lambda)\mathbf{y}) \;\geq\; \min\{u(\mathbf{x}),u(\mathbf{y})\}.$$
&lt;/blockquote>
&lt;p>A line segment joining two points above a level cannot dip below that level.&lt;/p>
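&lt;p>The midpoint inequality can be tested directly on a discrete mask. The following is a minimal sketch in plain Python, not the paper's Algorithm 1: the function name &lt;code>max_midpoint_violation&lt;/code>, the sampling scheme (endpoints on even pixel indices so that the midpoint is itself a pixel), and the two toy masks are all assumptions made for illustration.&lt;/p>

```python
import math
import random

def max_midpoint_violation(u, num_pairs=20000, seed=0):
    """Largest sampled value of min(u(x), u(y)) - u(midpoint), clipped at zero."""
    rng = random.Random(seed)
    rows, cols = len(u), len(u[0])
    worst = 0.0
    for _ in range(num_pairs):
        # Endpoints use even pixel indices so that their midpoint is also a pixel.
        xi, xj = 2 * rng.randrange((rows + 1) // 2), 2 * rng.randrange((cols + 1) // 2)
        yi, yj = 2 * rng.randrange((rows + 1) // 2), 2 * rng.randrange((cols + 1) // 2)
        mid = u[(xi + yi) // 2][(xj + yj) // 2]
        worst = max(worst, min(u[xi][xj], u[yi][yj]) - mid)
    return worst

def gaussian(cx, cy, sigma, n=65):
    """Sample exp(-r^2 / (2 sigma^2)) centered at (cx, cy) on an n-by-n grid over [-1, 1]^2."""
    ts = [-1.0 + 2.0 * k / (n - 1) for k in range(n)]
    return [[math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2 * sigma ** 2)) for x in ts] for y in ts]

# Toy masks (illustrative data, not from the paper).
bump = gaussian(0.0, 0.0, 0.4)
left, right = gaussian(-0.5, 0.0, 0.2), gaussian(0.5, 0.0, 0.2)
two_bumps = [[max(a, b) for a, b in zip(ra, rb)] for ra, rb in zip(left, right)]

print(max_midpoint_violation(bump))       # single bump: no violation expected
print(max_midpoint_violation(two_bumps))  # two separate blobs: clearly positive
```

&lt;p>A single radial bump passes the check, while a mask with two separated blobs fails it because the segment joining the two peaks dips into the valley between them.&lt;/p>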
&lt;h3 id="first-order">First-order condition ($u\in C^1$)&lt;/h3>
&lt;blockquote>
&lt;p>$u$ is quasi-concave $\Longleftrightarrow$ if $u(\mathbf{x})\geq u(\mathbf{y})$, then
$\nabla u(\mathbf{y})^{\top}(\mathbf{x}-\mathbf{y})\geq 0.$&lt;/p>
&lt;/blockquote>
&lt;p>The gradient at every point defines a &lt;strong>supporting hyperplane&lt;/strong> of the local
super-level set.&lt;/p>
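&lt;p>As a hedged illustration of the first-order condition, the sketch below evaluates $\nabla u(\mathbf{y})^{\top}(\mathbf{x}-\mathbf{y})$ for all pairs of points on a coarse grid, using analytic gradients of toy Gaussian masks. The helper names, the grid, and the masks are assumptions for this example and are not taken from the reference implementation.&lt;/p>

```python
import math

SIGMA = 0.2  # width of the toy Gaussian bumps (illustrative choice)

def bump(p, c):
    """Gaussian bump centered at c, evaluated at point p."""
    return math.exp(-((p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2) / (2 * SIGMA ** 2))

def grad_bump(p, c):
    """Analytic gradient of bump(p, c) with respect to p."""
    g = bump(p, c)
    return (-(p[0] - c[0]) / SIGMA ** 2 * g, -(p[1] - c[1]) / SIGMA ** 2 * g)

def first_order_violation(u, grad_u, n=9):
    """Largest value of -grad_u(y).(x - y) over grid pairs with u(x) >= u(y), clipped at zero."""
    pts = [(-1.0 + 2.0 * i / (n - 1), -1.0 + 2.0 * j / (n - 1)) for i in range(n) for j in range(n)]
    worst = 0.0
    for x in pts:
        for y in pts:
            if u(x) >= u(y):
                gx, gy = grad_u(y)
                worst = max(worst, -(gx * (x[0] - y[0]) + gy * (x[1] - y[1])))
    return worst

# Quasi-concave toy mask: one bump at the origin.
one = lambda p: bump(p, (0.0, 0.0))
grad_one = lambda p: grad_bump(p, (0.0, 0.0))

# Non-quasi-concave toy mask: average of two separated bumps.
C1, C2 = (-0.5, 0.0), (0.5, 0.0)
two = lambda p: 0.5 * (bump(p, C1) + bump(p, C2))
grad_two = lambda p: tuple(0.5 * (a + b) for a, b in zip(grad_bump(p, C1), grad_bump(p, C2)))

print(first_order_violation(one, grad_one))  # no violation expected
print(first_order_violation(two, grad_two))  # positive: some gradient points away from a higher point
```

&lt;p>For the two-bump mask, a point on the inner slope of the left bump has a gradient pointing toward the left peak, so the higher right peak lies on the wrong side of its supporting hyperplane.&lt;/p>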
&lt;h3 id="second-order">Second-order condition ($u\in C^2$, sufficient)&lt;/h3>
&lt;blockquote>
&lt;p>If for all $\mathbf{x}\in\Omega$ with $\nabla u(\mathbf{x})\neq 0$ the Hessian
$\nabla^2 u(\mathbf{x}) \prec 0$ (strictly negative definite) on the tangent space
$T_\mathbf{x}=\{\mathbf{d}\mid \nabla u(\mathbf{x})^{\top}\mathbf{d}=0\}$,
then $u$ is quasi-concave.&lt;/p>
&lt;/blockquote>
&lt;p>For 2D images this has the &lt;strong>compact convolutional form&lt;/strong>:&lt;/p>
$$
Q_2(\mathbf{x}) \;=\; u_x^2\,u_{yy} \;-\; 2\,u_x u_y\,u_{xy} \;+\; u_y^2\,u_{xx} \;&lt;\;0,
$$
&lt;p>a quadratic form in the image gradient that can be evaluated densely as a tiny
fixed-kernel convolution — no thresholding required.&lt;/p>
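&lt;p>As a short check of where this expression comes from: in 2D the tangent space $T_\mathbf{x}$ is spanned by the single direction $\mathbf{d}=(-u_y,\,u_x)^{\top}$, which is orthogonal to $\nabla u=(u_x,\,u_y)^{\top}$. Evaluating the Hessian quadratic form along $\mathbf{d}$ gives&lt;/p>
$$
\mathbf{d}^{\top}\nabla^2 u\,\mathbf{d}
\;=\; \begin{pmatrix}-u_y &amp; u_x\end{pmatrix}
\begin{pmatrix}u_{xx} &amp; u_{xy}\\ u_{xy} &amp; u_{yy}\end{pmatrix}
\begin{pmatrix}-u_y\\ u_x\end{pmatrix}
\;=\; u_y^2\,u_{xx} \;-\; 2\,u_x u_y\,u_{xy} \;+\; u_x^2\,u_{yy} \;=\; Q_2(\mathbf{x}),
$$
&lt;p>so negative definiteness of the Hessian on $T_\mathbf{x}$ reduces to the scalar inequality $Q_2(\mathbf{x})&lt;0$.&lt;/p>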
&lt;h3 id="unification">A unifying lens&lt;/h3>
&lt;p>Following Section 3.6 of the paper, D-Convexity &lt;strong>recovers many existing convex priors as special cases&lt;/strong>,
with each prior mapped to one of our zero-, first-, or second-order quasi-concavity conditions.
The mapping below uses the &lt;strong>exact references from the CVPR 2026 paper&lt;/strong>
(&lt;a href="https://arxiv.org/abs/2605.19210v1" target="_blank" rel="noopener">arXiv:2605.19210v1&lt;/a>):&lt;/p>
&lt;ul>
&lt;li>
&lt;p>&lt;strong>Zero-order line-segment prior.&lt;/strong>
&lt;a href="https://doi.org/10.1109/access.2020.2985095" title="Han, Kwon, Kim &amp;amp; Cho. Noise-Robust Pupil Center Detection Through CNN-Based Segmentation With Shape-Prior Loss. IEEE Access, 2020." target="_blank" rel="noopener">Han, Kwon, Kim &amp;amp; Cho, &lt;em>Noise-Robust Pupil Center Detection with Shape-Prior Loss&lt;/em>, IEEE Access 2020&lt;/a>
require that for every $\mathbf{x},\mathbf{y}$ in the segmentation object, the line segment between them
also lies inside it — this is exactly our &lt;strong>zero-order&lt;/strong> condition (Theorem 1) applied over the
image domain. Our formulation is more general because it applies to the continuous mask $u$ rather
than a single thresholded region.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Half-disk / binary convexity characterization.&lt;/strong>
The indicator-mask condition $(u-1)(b_r\ast(2u-1))\geq 0$ proposed in
&lt;a href="https://arxiv.org/abs/2005.07476" target="_blank" rel="noopener">Liu, Tai &amp;amp; Luo, &lt;em>Convex Shape Prior for Deep Neural Convolution Network based Eye Fundus Images Segmentation&lt;/em>, 2020&lt;/a>,
&lt;a href="https://doi.org/10.1142/S0219530521500238" target="_blank" rel="noopener">Luo, Tai &amp;amp; Wang, &lt;em>A New Binary Representation Method for Shape Convexity&lt;/em>, Analysis &amp;amp; Applications 2022&lt;/a>, and
&lt;a href="https://doi.org/10.1016/j.apm.2023.06.008" target="_blank" rel="noopener">Luo, Chen, Xiao &amp;amp; Tai, &lt;em>A Binary Characterization Method for Shape Convexity&lt;/em>, Applied Mathematical Modelling 2023&lt;/a>
follows directly from our &lt;strong>first-order&lt;/strong> supporting-hyperplane condition (Theorem 2): at a background
pixel $\mathbf{y}$, Lemma 1 forces the foreground into the half-space
$\nabla u(\mathbf{y})^{\top}(\mathbf{x}-\mathbf{y})\geq 0$, which intersected with a radius-$r$ disk
gives $|B_r(\mathbf{y})\cap S|\leq \tfrac{1}{2}|B_r(\mathbf{y})|$.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Curvature priors&lt;/strong> $\kappa\geq 0$.
&lt;a href="https://doi.org/10.1117/12.2006787" title="Ukwatta, Yuan, Qiu, Rajchl &amp;amp; Fenster. Efficient Convex Optimization-Based Curvature Dependent Contour Evolution. SPIE Medical Imaging, 2013." target="_blank" rel="noopener">Ukwatta et al., &lt;em>Efficient Convex Optimization-Based Curvature Dependent Contour Evolution&lt;/em>, SPIE 2013&lt;/a> and
&lt;a href="https://doi.org/10.1109/ICIP.2017.8296678" title="Yang, Shi, Yao &amp;amp; Li. A Level Set Method for Convexity Preserving Segmentation of Cardiac Left Ventricle. ICIP, 2017." target="_blank" rel="noopener">Yang et al., &lt;em>A Level Set Method for Convexity Preserving Segmentation of Cardiac Left Ventricle&lt;/em>, ICIP 2017&lt;/a>
constrain non-negative curvature of level-set boundaries — corresponding to $Q_2(\mathbf{x})\leq 0$, the
&lt;strong>necessary but not sufficient&lt;/strong> weakening of our &lt;strong>second-order&lt;/strong> condition $Q_2(\mathbf{x})&lt;0$.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Signed-distance Laplacian priors&lt;/strong> $\|\nabla\phi\|=1$ with $\Delta\phi\geq 0$.
&lt;a href="https://www.csd.uoc.gr/~hy471/papers/Convex_Shape_Prior_for_Multi-Object_Segmentation_ICCV_2019.pdf" title="Luo, Tai, Huo, Wang &amp;amp; Glowinski. Convex Shape Prior for Multi-Object Segmentation Using a Single Level Set Function. ICCV, 2019." target="_blank" rel="noopener">Luo, Tai, Huo, Wang &amp;amp; Glowinski, &lt;em>Convex Shape Prior for Multi-Object Segmentation&lt;/em>, ICCV 2019&lt;/a> and
&lt;a href="https://doi.org/10.1109/TIP.2020.2998981" title="Yan, Tai, Liu &amp;amp; Huang. Convexity Shape Prior for Level Set-Based Image Segmentation Method. IEEE Transactions on Image Processing, 2020." target="_blank" rel="noopener">Yan, Tai, Liu &amp;amp; Huang, &lt;em>Convexity Shape Prior for Level Set-Based Image Segmentation&lt;/em>, IEEE TIP 2020&lt;/a>
impose non-negativity of the signed-distance Laplacian. With $\phi=-u$, the curvature identity
$\kappa=-Q_2/\|\nabla u\|^3$ shows $\kappa\geq 0 \Leftrightarrow Q_2\leq 0$; D-Convexity&amp;rsquo;s strict
$Q_2&lt;0$ upgrades this into a &lt;em>sufficient&lt;/em> convexity condition while remaining fully differentiable.&lt;/p>
&lt;/li>
&lt;/ul>
&lt;p>&lt;strong>Related discrete convexity priors&lt;/strong> (discussed in Section 2 of the paper, and subsumed at the pixel-graph
scale by our zero-order view) include 1–0–1 collinear-triplet penalties
(&lt;a href="https://link.springer.com/chapter/10.1007/978-3-319-10602-1_44" title="Gorelick, Veksler, Boykov &amp;amp; Nieuwenhuis. Convexity Shape Prior for Segmentation. ECCV, 2014 (journal version: TPAMI, 2017)." target="_blank" rel="noopener">Gorelick, Veksler, Boykov &amp;amp; Nieuwenhuis, ECCV 2014 / TPAMI 2017&lt;/a>),
multicut / ILP convexity constraints
(&lt;a href="https://doi.org/10.1109/CVPR.2016.49" title="Royer, Richmond, Rother, Andres &amp;amp; Kainmüller. Convexity Shape Constraints for Image Segmentation. CVPR, 2016." target="_blank" rel="noopener">Royer, Richmond, Rother, Andres &amp;amp; Kainmüller, CVPR 2016&lt;/a>), and relaxed star-type families
(&lt;a href="https://doi.org/10.1007/978-3-540-88690-7_34" title="Veksler. Star Shape Prior for Graph-Cut Image Segmentation. ECCV, 2008." target="_blank" rel="noopener">Veksler, ECCV 2008&lt;/a>;
&lt;a href="https://doi.org/10.1109/CVPR.2010.5539890" title="Gulshan, Rother, Criminisi, Blake &amp;amp; Zisserman. Geodesic Star Convexity for Interactive Image Segmentation. CVPR, 2010." target="_blank" rel="noopener">Gulshan et al., CVPR 2010&lt;/a>;
&lt;a href="https://openaccess.thecvf.com/content_cvpr_2016/html/Isack_Hedgehog_Shape_Priors_CVPR_2016_paper.html" title="Isack, Veksler, Sonka &amp;amp; Boykov. Hedgehog Shape Priors for Multi-Object Segmentation. CVPR, 2016." target="_blank" rel="noopener">Isack, Veksler, Sonka &amp;amp; Boykov, CVPR 2016&lt;/a>).&lt;/p>
&lt;p>So a single quasi-concavity principle subsumes discrete, half-disk, level-set, and curvature-based
shape priors in &lt;strong>one continuous, differentiable framework&lt;/strong>, with each prior corresponding to the
smoothness order ($C^0$ / $C^1$ / $C^2$) at which it operates.&lt;/p>
&lt;hr>
&lt;h2 id="cgpm">Loss Functions and CGPM&lt;/h2>
&lt;p>The first- and second-order conditions become &lt;strong>local convolutional losses&lt;/strong>, evaluated
densely over the image without any thresholding:&lt;/p>
&lt;ul>
&lt;li>&lt;strong>First-order loss&lt;/strong> ($\mathcal{L}_{\text{1st}}$): penalize the positive part of the
asymmetric pair inequality $\mathrm{ReLU}\big(\nabla u(\mathbf{y})^{\top}(\mathbf{y}-\mathbf{x})\big)$
over a small $r$-radius neighborhood $\mathbf{x}\in N_{\mathbf{y}}$.&lt;/li>
&lt;li>&lt;strong>Second-order loss&lt;/strong> ($\mathcal{L}_{\text{2nd}}$): penalize the positive part of
$Q_2(\mathbf{x})+\delta$ weighted by $\|\nabla u(\mathbf{x})\|$:&lt;/li>
&lt;/ul>
$$
\mathcal{L}_{\text{2nd}}(u) \;=\; \frac{1}{|\Omega|}\sum_{\mathbf{x}\in\Omega} \|\nabla u(\mathbf{x})\|\cdot \mathrm{ReLU}\big(Q_2(\mathbf{x})+\delta\big).
$$
&lt;p>The first-order loss costs $\mathcal{O}(r^2|\Omega|)$ and the second-order loss
$\mathcal{O}(|\Omega|)$; both are GPU-parallel and have explicit closed-form gradients
(see Appendix E of the paper).&lt;/p>
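&lt;p>To make the second-order loss concrete, here is a minimal sketch in plain Python that evaluates $\|\nabla u\|\cdot\mathrm{ReLU}(Q_2+\delta)$ with 3-by-3 central-difference stencils. It is not the code in &lt;code>loss.py&lt;/code>: the function name, the choice of stencils, averaging over interior pixels only, the grid spacing &lt;code>h&lt;/code>, and the toy masks are assumptions for illustration. A practical version would apply the same stencils as fixed-kernel convolutions on the GPU.&lt;/p>

```python
import math

def second_order_loss(u, h=1.0, delta=0.0):
    """Mean over interior pixels of |grad u| * ReLU(Q_2 + delta), using central differences."""
    rows, cols = len(u), len(u[0])
    total = 0.0
    for i in range(1, rows - 1):          # i indexes y (rows)
        for j in range(1, cols - 1):      # j indexes x (columns)
            ux = (u[i][j + 1] - u[i][j - 1]) / (2 * h)
            uy = (u[i + 1][j] - u[i - 1][j]) / (2 * h)
            uxx = (u[i][j + 1] - 2 * u[i][j] + u[i][j - 1]) / h ** 2
            uyy = (u[i + 1][j] - 2 * u[i][j] + u[i - 1][j]) / h ** 2
            uxy = (u[i + 1][j + 1] - u[i + 1][j - 1]
                   - u[i - 1][j + 1] + u[i - 1][j - 1]) / (4 * h ** 2)
            q2 = ux * ux * uyy - 2 * ux * uy * uxy + uy * uy * uxx
            total += math.hypot(ux, uy) * max(q2 + delta, 0.0)
    return total / ((rows - 2) * (cols - 2))

def gaussian(cx, cy, sigma, n=65):
    """Sample exp(-r^2 / (2 sigma^2)) centered at (cx, cy) on an n-by-n grid over [-1, 1]^2."""
    ts = [-1.0 + 2.0 * k / (n - 1) for k in range(n)]
    return [[math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2 * sigma ** 2)) for x in ts] for y in ts]

# Toy masks (illustrative data, not from the paper).
h = 2.0 / 64
bump = gaussian(0.0, 0.0, 0.4)
left, right = gaussian(-0.5, 0.0, 0.2), gaussian(0.5, 0.0, 0.2)
two_bumps = [[0.5 * (a + b) for a, b in zip(ra, rb)] for ra, rb in zip(left, right)]

print(second_order_loss(bump, h))       # single bump: essentially zero
print(second_order_loss(two_bumps, h))  # the saddle between the blobs is penalized
```

&lt;p>With $\delta=0$ the single bump incurs essentially no penalty, whereas the two-bump mask is penalized in the band between the blobs, where $u_{xx}>0$ makes $Q_2$ positive.&lt;/p>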
&lt;h3 id="convex-gradient-projection-module-cgpm">Convex Gradient Projection Module (CGPM)&lt;/h3>
&lt;p>Training with the loss alone does not guarantee strictly convex predictions at inference time. The
&lt;strong>CGPM&lt;/strong> therefore solves a small proximal optimization on the network logits:&lt;/p>
$$
u_p \in \arg\min_{v\in[0,1]} \tfrac{1}{2}\|v-u\|^2 + \lambda\cdot \mathcal{L}_{\text{convex}}(v),
$$
&lt;p>with $\mathcal{L}_{\text{convex}}\in\{\mathcal{L}_{\text{1st}},\mathcal{L}_{\text{2nd}}\}$.
Implemented as an &lt;strong>unrolled gradient-descent module&lt;/strong> on the logit space, CGPM is a
drop-in projection layer compatible with any segmentation backbone (U-Net, nnU-Net,
TransUNet, etc.):&lt;/p>
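&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-python" data-lang="python">&lt;span class="line">&lt;span class="cl">&lt;span class="kn">import&lt;/span> &lt;span class="nn">torch&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="kn">from&lt;/span> &lt;span class="nn">CGPM&lt;/span> &lt;span class="kn">import&lt;/span> &lt;span class="n">SegModelWithCGPM&lt;/span>  &lt;span class="c1"># CGPM.py in the repository&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1"># Assumed to be defined by the caller: UNet2D (backbone class), ckpt (state dict), images (input batch).&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="n">device&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="n">torch&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">device&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="s2">"cuda"&lt;/span> &lt;span class="k">if&lt;/span> &lt;span class="n">torch&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">cuda&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">is_available&lt;/span>&lt;span class="p">()&lt;/span> &lt;span class="k">else&lt;/span> &lt;span class="s2">"cpu"&lt;/span>&lt;span class="p">)&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="n">model&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="n">UNet2D&lt;/span>&lt;span class="p">()&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">to&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">device&lt;/span>&lt;span class="p">)&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="n">model&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">load_state_dict&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">ckpt&lt;/span>&lt;span class="p">)&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="n">model&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">eval&lt;/span>&lt;span class="p">()&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1"># backprop_to_backbone=False: CGPM as a post-hoc projection on a frozen backbone (reading inferred from the flag name).&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="n">SegCGPM&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="n">SegModelWithCGPM&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">model&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="n">backprop_to_backbone&lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="kc">False&lt;/span>&lt;span class="p">)&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="n">cgpm_output&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="n">SegCGPM&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">images&lt;/span>&lt;span class="p">)&lt;/span>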
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>CGPM can be used in &lt;strong>train mode&lt;/strong> (back-propagated into the backbone) or as a
&lt;strong>post-hoc projection&lt;/strong> (frozen backbone, projection only).&lt;/p>
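&lt;p>The structure of an unrolled projection can be illustrated on a toy problem. The sketch below runs a fixed number of gradient steps on $\tfrac{1}{2}\|v-u\|^2+\lambda\,\mathcal{L}(v)$ with clipping to $[0,1]$, for a 1D profile and a simple neighbor "dip" penalty. Everything here is an assumption made for illustration: the 1D setting, the penalty (a zero-order style term rather than $\mathcal{L}_{\text{1st}}$ or $\mathcal{L}_{\text{2nd}}$), the step size, the number of steps, and working directly on mask values instead of logits. It is not the CGPM implementation in &lt;code>CGPM.py&lt;/code>.&lt;/p>

```python
def penalty_and_grad(v):
    """Toy 1D dip penalty sum_i 0.5 * relu(min(v[i-1], v[i+1]) - v[i])**2 and its gradient."""
    loss, grad = 0.0, [0.0] * len(v)
    for i in range(1, len(v) - 1):
        k = i - 1 if v[i + 1] >= v[i - 1] else i + 1  # index of the lower neighbor
        gap = v[k] - v[i]
        if gap > 0.0:
            loss += 0.5 * gap * gap
            grad[k] += gap   # lowering the smaller neighbor shrinks the dip
            grad[i] -= gap   # raising the center shrinks the dip
    return loss, grad

def unrolled_projection(u, lam=5.0, step=0.05, num_steps=200):
    """Fixed number of gradient steps on 0.5 * ||v - u||^2 + lam * penalty(v), clipped to [0, 1]."""
    v = list(u)
    for _ in range(num_steps):
        _, g = penalty_and_grad(v)
        v = [min(1.0, max(0.0, vi - step * ((vi - ui) + lam * gi)))
             for vi, ui, gi in zip(v, u, g)]
    return v

u = [0.1, 0.8, 0.2, 0.9, 0.1]   # toy profile with a dip at index 2
v = unrolled_projection(u)
print(penalty_and_grad(u)[0], penalty_and_grad(v)[0])
print([round(x, 3) for x in v])
```

&lt;p>Because the penalty is soft, the dip is reduced rather than removed; a larger $\lambda$ trades fidelity to $u$ for a smaller residual violation.&lt;/p>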
&lt;hr>
&lt;h2 id="experiments">Experimental Results&lt;/h2>
&lt;p>We evaluate D-Convexity on four segmentation benchmarks spanning cardiac MRI
(&lt;strong>ACDC&lt;/strong>), iris segmentation (&lt;strong>CASIA&lt;/strong>), and retinal optic-disc/cup
segmentation (&lt;strong>REFUGE&lt;/strong>, &lt;strong>RIM-ONE-r3&lt;/strong>). To assess &lt;strong>out-of-distribution
generalization&lt;/strong>, models trained on REFUGE are evaluated &lt;em>directly&lt;/em> on
RIM-ONE-r3 without fine-tuning. Reported metrics are Dice ↑, IoU ↑, and
Hausdorff Distance HD ↓.&lt;/p>
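&lt;p>For reference, the two overlap metrics can be computed from binary masks as in the short sketch below. The function name and the convention of returning 1.0 when both masks are empty are assumptions for this example; the paper's exact evaluation protocol (for instance per-image versus per-dataset averaging) is not specified here. Table values are these scores expressed as percentages.&lt;/p>

```python
def dice_and_iou(pred, target):
    """Dice and IoU of two binary masks given as flat sequences of 0/1 values."""
    inter = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    union = total - inter
    dice = 2.0 * inter / total if total > 0 else 1.0   # empty-mask convention: assumption
    iou = inter / union if union > 0 else 1.0
    return dice, iou

pred = [1, 1, 1, 0, 0, 0]
target = [0, 1, 1, 1, 0, 0]
dice, iou = dice_and_iou(pred, target)
print(dice, iou)
```

&lt;p>For a single pair of masks the two scores are linked by $\mathrm{Dice}=2\,\mathrm{IoU}/(1+\mathrm{IoU})$, so they always rank a pair of predictions the same way.&lt;/p>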
&lt;h3 id="qualitative">Qualitative comparison&lt;/h3>
&lt;figure id="figure-figure-3-qualitative-segmentation-comparison-rows-cardiac-mri-acdc-iris-casia-and-retinal-optic-disccup-refuge--rim-one-r3-columns-a-input-b-ground-truth-ch-six-baselines-i-proposed-d-convexity-color-code--white--true-positive--black--true-negative--red--false-positive--green--false-negative--blue--predicted-boundary-baselines-tend-to-produce-fragmented-holes-green-and-spurious-lobes-red-d-convexity-yields-clean-simply-connected-convex-regions-that-tightly-track-the-ground-truth-boundary">
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" style="width: 100%; ">&lt;img alt="Qualitative segmentation comparison across cardiac MRI, eye, and retinal fundus images. Each row is one image; columns show (a) image, (b) ground truth, and predictions from (c) U-Net, (d) Swin-Unet, (e) DCAN, (f) DMTN, (g) ConvMCD, (h) Active Boundary, (i) the proposed D-Convexity. Baselines produce fragmented holes (green false-negatives) and spurious lobes (red false-positives), while D-Convexity returns clean, simply-connected, convex regions that closely follow the ground truth boundary." srcset="
/chen-dconvexity-cvpr-2026/figures/qualitative_comparison_hue70167b8aaf56e0966ff3e25d321b857_1038093_a3f1b01383f1a541f0e216d0964d6f45.webp 400w,
/chen-dconvexity-cvpr-2026/figures/qualitative_comparison_hue70167b8aaf56e0966ff3e25d321b857_1038093_057fa53ae35268b3b76f04fb8b9d91a9.webp 760w,
/chen-dconvexity-cvpr-2026/figures/qualitative_comparison_hue70167b8aaf56e0966ff3e25d321b857_1038093_1200x1200_fit_q75_h2_lanczos_3.webp 1200w"
src="https://hyan46.github.io/chen-dconvexity-cvpr-2026/figures/qualitative_comparison_hue70167b8aaf56e0966ff3e25d321b857_1038093_a3f1b01383f1a541f0e216d0964d6f45.webp"
loading="lazy"
style="width: 100%; height: auto; display: block;" />&lt;/div>
&lt;/div>&lt;figcaption>
&lt;span class="figure-number">Figure 3: &lt;/span>&lt;strong>Qualitative segmentation comparison.&lt;/strong> Rows: cardiac MRI (ACDC), iris (CASIA), and retinal optic-disc/cup (REFUGE &amp;amp; RIM-ONE-r3). Columns: (a) input, (b) ground truth, (c)–(h) six baselines, (i) &lt;strong>Proposed (D-Convexity)&lt;/strong>. Color code: ▢ white = true positive, ■ black = true negative, &lt;span style="color:#d62728;">■&lt;/span> red = false positive, &lt;span style="color:#2ca02c;">■&lt;/span> green = false negative, &lt;span style="color:#0a66c2;">▢&lt;/span> blue = predicted boundary. Baselines tend to produce fragmented holes (green) and spurious lobes (red); D-Convexity yields &lt;strong>clean, simply-connected, convex&lt;/strong> regions that tightly track the ground-truth boundary.
&lt;/figcaption>&lt;/figure>
&lt;h3 id="quantitative">Quantitative results&lt;/h3>
&lt;style>
.dconv-results-wrap { overflow-x: auto; margin: 1.25rem 0; }
table.dconv-results {
width: 100%;
border-collapse: collapse;
font-size: 0.95rem;
font-family: 'Noto Sans', sans-serif;
background: #fff;
}
table.dconv-results th, table.dconv-results td {
padding: 8px 10px;
text-align: center;
border-bottom: 1px solid #e6e6e6;
}
table.dconv-results thead tr.group th {
background: #f5f7fa;
font-weight: 700;
border-bottom: 1px solid #d6d9df;
}
table.dconv-results thead tr.metric th {
background: #fafbfd;
font-weight: 600;
color: #555;
border-bottom: 2px solid #cfd3da;
}
table.dconv-results td.method, table.dconv-results th.method {
text-align: left;
font-weight: 500;
white-space: nowrap;
}
table.dconv-results tr.proposed {
background: #eaf3ff;
font-weight: 700;
}
table.dconv-results tr.proposed td { border-bottom: 1px solid #c9def5; }
table.dconv-results td.best { color: #0a66c2; font-weight: 700; }
table.dconv-results td .sep { color: #aaa; }
table.dconv-results caption {
caption-side: top;
text-align: left;
padding: 0.25rem 0 0.75rem 0;
font-size: 0.95rem;
color: #444;
}
&lt;/style>
&lt;div class="dconv-results-wrap">
&lt;table class="dconv-results">
&lt;caption>&lt;strong>Table 1.&lt;/strong> Performance of baseline and shape-aware methods on the
ACDC, CASIA, REFUGE, and RIM-ONE-r3 datasets. Models trained on REFUGE are evaluated
&lt;em>directly&lt;/em> on RIM-ONE-r3 to assess cross-dataset generalization.
Best values per column are in &lt;span style="color:#0a66c2;font-weight:700;">blue&lt;/span>;
our method (&lt;em>Proposed&lt;/em>) is highlighted.&lt;/caption>
&lt;thead>
&lt;tr class="group">
&lt;th class="method" rowspan="2">Method&lt;/th>
&lt;th colspan="3">ACDC&lt;/th>
&lt;th colspan="3">CASIA&lt;/th>
&lt;th colspan="3">REFUGE&lt;/th>
&lt;th colspan="3">RIM-ONE-r3&lt;/th>
&lt;/tr>
&lt;tr class="metric">
&lt;th>Dice&amp;nbsp;↑&lt;/th>&lt;th>IoU&amp;nbsp;↑&lt;/th>&lt;th>HD&amp;nbsp;↓&lt;/th>
&lt;th>Dice&amp;nbsp;↑&lt;/th>&lt;th>IoU&amp;nbsp;↑&lt;/th>&lt;th>HD&amp;nbsp;↓&lt;/th>
&lt;th>Dice&amp;nbsp;↑&lt;/th>&lt;th>IoU&amp;nbsp;↑&lt;/th>&lt;th>HD&amp;nbsp;↓&lt;/th>
&lt;th>Dice&amp;nbsp;↑&lt;/th>&lt;th>IoU&amp;nbsp;↑&lt;/th>&lt;th>HD&amp;nbsp;↓&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td class="method">U-Net [28]&lt;/td>
&lt;td>89.52&lt;/td>&lt;td>81.02&lt;/td>&lt;td>28.04&lt;/td>
&lt;td>94.65&lt;/td>&lt;td>89.84&lt;/td>&lt;td>2.549&lt;/td>
&lt;td>84.66&lt;/td>&lt;td>73.71&lt;/td>&lt;td>11.07&lt;/td>
&lt;td>76.48&lt;/td>&lt;td>61.92&lt;/td>&lt;td>20.57&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td class="method">Swin-Unet [3]&lt;/td>
&lt;td>95.42&lt;/td>&lt;td>91.23&lt;/td>&lt;td>4.965&lt;/td>
&lt;td>94.76&lt;/td>&lt;td>90.05&lt;/td>&lt;td>2.399&lt;/td>
&lt;td>84.00&lt;/td>&lt;td>72.42&lt;/td>&lt;td>7.863&lt;/td>
&lt;td>81.00&lt;/td>&lt;td>68.07&lt;/td>&lt;td>15.32&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td class="method">DCAN [4]&lt;/td>
&lt;td>93.38&lt;/td>&lt;td>87.59&lt;/td>&lt;td>6.946&lt;/td>
&lt;td>94.90&lt;/td>&lt;td>90.29&lt;/td>&lt;td>2.413&lt;/td>
&lt;td>80.66&lt;/td>&lt;td>67.59&lt;/td>&lt;td>9.379&lt;/td>
&lt;td>76.23&lt;/td>&lt;td>61.59&lt;/td>&lt;td>16.53&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td class="method">DMTN [31]&lt;/td>
&lt;td>92.60&lt;/td>&lt;td>86.22&lt;/td>&lt;td>8.500&lt;/td>
&lt;td>94.92&lt;/td>&lt;td>90.34&lt;/td>&lt;td>2.337&lt;/td>
&lt;td>82.36&lt;/td>&lt;td>70.01&lt;/td>&lt;td>9.337&lt;/td>
&lt;td>78.39&lt;/td>&lt;td>64.46&lt;/td>&lt;td>16.80&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td class="method">ConvMCD [25]&lt;/td>
&lt;td>93.44&lt;/td>&lt;td>87.68&lt;/td>&lt;td>15.53&lt;/td>
&lt;td class="best">95.03&lt;/td>&lt;td class="best">90.54&lt;/td>&lt;td>2.323&lt;/td>
&lt;td>78.38&lt;/td>&lt;td>64.45&lt;/td>&lt;td>12.51&lt;/td>
&lt;td>76.71&lt;/td>&lt;td>62.22&lt;/td>&lt;td>18.18&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td class="method">Active Boundary [35]&lt;/td>
&lt;td>90.93&lt;/td>&lt;td>81.38&lt;/td>&lt;td>24.71&lt;/td>
&lt;td>94.49&lt;/td>&lt;td>89.55&lt;/td>&lt;td>2.656&lt;/td>
&lt;td>84.82&lt;/td>&lt;td>73.63&lt;/td>&lt;td>10.59&lt;/td>
&lt;td>75.37&lt;/td>&lt;td>60.48&lt;/td>&lt;td>20.64&lt;/td>
&lt;/tr>
&lt;tr class="proposed">
&lt;td class="method">Proposed (D-Convexity)&lt;/td>
&lt;td class="best">95.46&lt;/td>&lt;td class="best">91.31&lt;/td>&lt;td class="best">4.702&lt;/td>
&lt;td>94.71&lt;/td>&lt;td>89.94&lt;/td>&lt;td class="best">2.288&lt;/td>
&lt;td class="best">88.61&lt;/td>&lt;td class="best">79.54&lt;/td>&lt;td class="best">5.859&lt;/td>
&lt;td class="best">83.09&lt;/td>&lt;td class="best">71.08&lt;/td>&lt;td class="best">12.59&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;/div>
&lt;p>&lt;strong>Takeaways.&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Best overall on 3 of 4 datasets.&lt;/strong> D-Convexity is the top performer on
ACDC, REFUGE, and RIM-ONE-r3 across all three metrics, and is best on
Hausdorff Distance on CASIA. Dice/IoU on CASIA are essentially saturated
for all methods (Dice spans 94.49 to 95.03 and IoU spans 89.55 to 90.54).&lt;/li>
&lt;li>&lt;strong>Largest gains on hard, shape-driven tasks.&lt;/strong> On REFUGE, D-Convexity
improves Dice from 84.82 → &lt;strong>88.61&lt;/strong> (+3.79) and reduces HD from 7.863 →
&lt;strong>5.859&lt;/strong> (−2.0) versus the strongest baseline on each metric (Active Boundary
for Dice, Swin-Unet for HD). On the ACDC cardiac task the margin over Swin-Unet is
smaller (+0.04 Dice, −0.26 HD).&lt;/li>
&lt;li>&lt;strong>Strong out-of-distribution generalization.&lt;/strong> When the REFUGE-trained
model is applied &lt;em>directly&lt;/em> to RIM-ONE-r3 (different acquisition device
and population), D-Convexity still wins by &lt;strong>+2.1 Dice&lt;/strong> and &lt;strong>−2.7 HD&lt;/strong>
over Swin-Unet — evidence that the convex shape prior acts as a robust,
task-agnostic regularizer rather than overfitting to a particular dataset.&lt;/li>
&lt;li>&lt;strong>Drop-in improvement.&lt;/strong> All gains are obtained with the same backbone
segmentation network as the baselines, with CGPM as a plug-in module — no
architectural changes are required.&lt;/li>
&lt;/ul>
&lt;hr>
&lt;h2 id="key-ideas">Key Contributions&lt;/h2>
&lt;ul>
&lt;li>&lt;strong>Quasi-concavity as a unified convex prior.&lt;/strong> We formalize convexity of &lt;em>all&lt;/em>
super-level sets as quasi-concavity of the network output $u$, yielding a
threshold-free, differentiable, image-domain constraint.&lt;/li>
&lt;li>&lt;strong>Multi-order characterizations.&lt;/strong> Zero-, first-, and second-order conditions for
$u\in C^0,C^1,C^2$, corresponding to different mask smoothness regimes.&lt;/li>
&lt;li>&lt;strong>Compact convolutional losses.&lt;/strong> The first- and second-order conditions reduce to
tiny fixed-kernel convolutions, allowing dense evaluation across the image at
$\mathcal{O}(r^2|\Omega|)$ and $\mathcal{O}(|\Omega|)$ cost, respectively.&lt;/li>
&lt;li>&lt;strong>Convex Gradient Projection Module (CGPM).&lt;/strong> A plug-and-play unrolled-optimization
module that strictly enforces convexity at inference time.&lt;/li>
&lt;li>&lt;strong>Theoretical unification.&lt;/strong> Discrete 1–0–1 priors, half-disk convolution priors, and
curvature / signed-distance Laplacian priors are all recovered as special cases or
necessary weakenings of our framework.&lt;/li>
&lt;li>&lt;strong>Empirical gains.&lt;/strong> Consistent convexity and shape-regularity improvements across
multiple medical-imaging datasets (retinal fundus, cardiac MRI, iris),
outperforming task-specific networks and prior shape-aware methods.&lt;/li>
&lt;/ul>
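&lt;p>As a minimal illustration of the zero-order idea (a sketch under our own assumptions, not the paper's loss and not the code in the repository): a function is quasi-concave when $u(\lambda x + (1-\lambda) y) \geq \min(u(x), u(y))$ for all $\lambda \in [0,1]$. On a 1-D grid this means every sample between two grid points is at least the smaller endpoint value, which is the same as every super-level set being a contiguous interval. The helper names below are hypothetical.&lt;/p>

```python
def is_quasi_concave_1d(u):
    """Check the zero-order condition on a 1-D profile.

    For every pair of indices i, j and every k strictly between them,
    require u[k] >= min(u[i], u[j]).
    """
    n = len(u)
    for i in range(n):
        for j in range(i + 1, n):
            floor = min(u[i], u[j])
            if any(floor > u[k] for k in range(i + 1, j)):
                return False
    return True


def super_level_set(u, gamma):
    """Indices where u >= gamma (a discrete analogue of S_gamma)."""
    return [k for k, value in enumerate(u) if value >= gamma]


unimodal = [0.1, 0.4, 0.9, 0.7, 0.2]   # rises, then falls
bimodal = [0.1, 0.8, 0.3, 0.9, 0.2]    # dip between two peaks

print(is_quasi_concave_1d(unimodal))   # True
print(is_quasi_concave_1d(bimodal))    # False
print(super_level_set(bimodal, 0.5))   # [1, 3]
```

&lt;p>The bimodal profile fails because its super-level set at $\gamma = 0.5$ skips index 2, so it is not an interval. This brute-force check is cubic in the grid size and is meant only for intuition.&lt;/p>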
&lt;hr>
&lt;h2 id="quickstart">Quick Start&lt;/h2>
&lt;p>The reference implementation is available on GitHub:
&lt;a href="https://github.com/ShengzheC/D-Convexity" target="_blank" rel="noopener">&lt;strong>ShengzheC/D-Convexity&lt;/strong>&lt;/a>.&lt;/p>
&lt;p>For intuition on the convexification algorithm and the zero-order dynamics, start with
the notebook:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-fallback" data-lang="fallback">&lt;span class="line">&lt;span class="cl">Convexification_Algorithm.ipynb
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>The CGPM segmentation framework lives in &lt;code>CGPM.py&lt;/code>, and the first- and second-order
losses in &lt;code>loss.py&lt;/code>.&lt;/p>
&lt;hr>
&lt;h2 id="resources">Resources&lt;/h2>
&lt;ul>
&lt;li>&lt;strong>Paper (arXiv):&lt;/strong> &lt;a href="https://arxiv.org/abs/2605.19210v1" target="_blank" rel="noopener">arXiv:2605.19210&lt;/a>&lt;/li>
&lt;li>&lt;strong>Code:&lt;/strong> &lt;a href="https://github.com/ShengzheC/D-Convexity" target="_blank" rel="noopener">github.com/ShengzheC/D-Convexity&lt;/a>&lt;/li>
&lt;li>&lt;strong>CVPR 2026 virtual poster:&lt;/strong> &lt;a href="https://cvpr.thecvf.com/virtual/2026/poster/39174" target="_blank" rel="noopener">cvpr.thecvf.com/virtual/2026/poster/39174&lt;/a>&lt;/li>
&lt;li>&lt;strong>Venue:&lt;/strong> CVPR 2026 (Highlight, top 3%)&lt;/li>
&lt;/ul>
&lt;hr>
&lt;h2 id="bibtex">BibTeX&lt;/h2>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-bibtex" data-lang="bibtex">&lt;span class="line">&lt;span class="cl">&lt;span class="nc">@inproceedings&lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="nl">chen2026dconvexity&lt;/span>&lt;span class="p">,&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="na">title&lt;/span> &lt;span class="p">=&lt;/span> &lt;span class="s">{D-Convexity: A Unified Differentiable Convex Shape Prior via Quasi-Concavity for Data-driven Image Segmentation}&lt;/span>&lt;span class="p">,&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="na">author&lt;/span> &lt;span class="p">=&lt;/span> &lt;span class="s">{Chen, Shengzhe and Yan, Hao}&lt;/span>&lt;span class="p">,&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="na">booktitle&lt;/span> &lt;span class="p">=&lt;/span> &lt;span class="s">{Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)}&lt;/span>&lt;span class="p">,&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="na">year&lt;/span> &lt;span class="p">=&lt;/span> &lt;span class="s">{2026}&lt;/span>&lt;span class="p">,&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="na">note&lt;/span> &lt;span class="p">=&lt;/span> &lt;span class="s">{Accepted as Highlight (top 3\%)}&lt;/span>&lt;span class="p">,&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="na">eprint&lt;/span> &lt;span class="p">=&lt;/span> &lt;span class="s">{2605.19210}&lt;/span>&lt;span class="p">,&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="na">archivePrefix&lt;/span> &lt;span class="p">=&lt;/span> &lt;span class="s">{arXiv}&lt;/span>&lt;span class="p">,&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="na">primaryClass&lt;/span> &lt;span class="p">=&lt;/span> &lt;span class="s">{cs.CV}&lt;/span>&lt;span class="p">,&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="na">url&lt;/span> &lt;span class="p">=&lt;/span> &lt;span class="s">{https://arxiv.org/abs/2605.19210v1}&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="p">}&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div></description></item><item><title>Path-Coupled Bellman Flows for Distributional Reinforcement Learning</title><link>https://hyan46.github.io/xu-path-coupled-icml-2026/</link><pubDate>Thu, 01 Jan 2026 00:00:00 +0000</pubDate><guid>https://hyan46.github.io/xu-path-coupled-icml-2026/</guid><description>&lt;h2 id="overview">Overview&lt;/h2>
&lt;p>&lt;strong>Path-Coupled Bellman Flows (PCBF)&lt;/strong> is a continuous-time distributional reinforcement
learning method that learns return distributions with &lt;strong>flow matching&lt;/strong> using
&lt;strong>source-consistent Bellman-coupled paths&lt;/strong>: the current path starts from the required base
prior at $t{=}0$, reaches the Bellman target at $t{=}1$, and maintains a pathwise affine
relation to the successor flow at intermediate times. PCBF couples current and successor
return flows through &lt;strong>shared base noise&lt;/strong> and uses a &lt;strong>$\lambda$-parameterized control
variate&lt;/strong> that trades controlled bias for variance reduction in critic training.&lt;/p>
&lt;p>Accepted at &lt;strong>&lt;a href="https://icml.cc" target="_blank" rel="noopener">ICML 2026&lt;/a>&lt;/strong> as a &lt;strong>regular-track presentation&lt;/strong>.&lt;/p>
&lt;figure id="figure-figure-1-path-coupled-bellman-geometry-each-panel-shows-a-single-current-blue-and-successor-orange-return-flow-a-uncoupled-independent-source-noise--flows-are-unrelated-except-in-distribution-b-source-inconsistent-the-successor-starts-from-rgamma-x_0-violating-the-base-prior-at-t0-c-pcbf-shared-noise-drives-both-flows-preserving-the-base-prior-at-t0-and-the-bellman-endpoint-at-t1">
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" style="width: 100%; ">&lt;img alt="Path-coupled Bellman geometry: uncoupled flows use independent noise; source-inconsistent flows violate the base prior at t=0; PCBF uses shared noise to preserve both the Gaussian source and the Bellman endpoint." srcset="
/xu-path-coupled-icml-2026/figures/comparison_hud14d972c15fc2473c8ae6fc483bd09b9_239790_67af11229e2c97b2751b66d1160f6599.webp 400w,
/xu-path-coupled-icml-2026/figures/comparison_hud14d972c15fc2473c8ae6fc483bd09b9_239790_dafef5a62ceff156ea1c4a126825fd14.webp 760w,
/xu-path-coupled-icml-2026/figures/comparison_hud14d972c15fc2473c8ae6fc483bd09b9_239790_1200x1200_fit_q75_h2_lanczos_3.webp 1200w"
src="https://hyan46.github.io/xu-path-coupled-icml-2026/figures/comparison_hud14d972c15fc2473c8ae6fc483bd09b9_239790_67af11229e2c97b2751b66d1160f6599.webp"
loading="lazy"
style="width: 100%; height: auto; display: block;" />&lt;/div>
&lt;/div>&lt;figcaption>
&lt;span class="figure-number">Figure 1: &lt;/span>&lt;strong>Path-coupled Bellman geometry.&lt;/strong> Each panel shows a single current (blue) and successor (orange) return flow. &lt;strong>(a)&lt;/strong> Uncoupled: independent source noise; flows are unrelated except in distribution. &lt;strong>(b)&lt;/strong> Source-inconsistent: the successor starts from $R+\gamma X_0$, violating the base prior at $t{=}0$. &lt;strong>(c)&lt;/strong> &lt;strong>PCBF:&lt;/strong> shared noise drives both flows, preserving the base prior at $t{=}0$ and the Bellman endpoint at $t{=}1$.
&lt;/figcaption>&lt;/figure>
&lt;hr>
&lt;h2 id="animation">Animated Demo&lt;/h2>
&lt;p>The animation below visualizes learned return transport on the &lt;strong>Discrete Monte Carlo&lt;/strong>
toy environment: particles flow from a Gaussian source at $t{=}0$ to the learned return
distribution at $t{=}1$ along PCBF Bellman-coupled trajectories.&lt;/p>
&lt;figure id="figure-learned-pcbf-return-transport-on-the-discrete-monte-carlo-environment-individual-particles-colored-trajectories-are-transported-from-the-base-noise-distribution-at-t0-to-state-dependent-return-outcomes-at-t1">
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" style="max-width: 980px; width: 100%; ">&lt;img alt="Demonstration of PCBF learned return transport on the Discrete MC environment"
src="https://hyan46.github.io/xu-path-coupled-icml-2026/figures/demo.gif"
loading="lazy"
style="width: 100%; height: auto; display: block;" />&lt;/div>
&lt;/div>&lt;figcaption>
Learned PCBF return transport on the Discrete Monte Carlo environment. Individual particles (colored trajectories) are transported from the base noise distribution at $t{=}0$ to state-dependent return outcomes at $t{=}1$.
&lt;/figcaption>&lt;/figure>
&lt;hr>
&lt;h2 id="motivation">Motivation&lt;/h2>
&lt;p>Distributional reinforcement learning (DRL) models the full distribution of returns rather
than only their expectation, enabling richer uncertainty representations and often better
empirical performance. Most practical DRL algorithms, however, rely on &lt;strong>finite-dimensional
approximations&lt;/strong> — categorical projections or quantile assignments — that introduce bias
when the Bellman update does not align with fixed support points.&lt;/p>
&lt;p>Reframing DRL as &lt;strong>continuous probability transport&lt;/strong> makes flow matching a natural
framework: the distributional Bellman equation defines an affine transport relationship,
and a neural velocity field can transport samples from a simple Gaussian prior to the
return law without heuristic projections.&lt;/p>
&lt;p>Directly enforcing an uncorrected pointwise Bellman map inside flow composition fails in
two critical ways:&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Source boundary mismatch.&lt;/strong> Flow matching requires generation to start from a fixed
simple prior (e.g., $\mathcal{N}(0,1)$), but an uncorrected Bellman update
$Z_t = R + \gamma Z'_t$ starts from $R + \gamma X_0 \neq X_0$.&lt;/li>
&lt;li>&lt;strong>High-variance bootstrapping.&lt;/strong> When current and successor noises are sampled
independently, intermediate trajectories are not pathwise aligned; Bellman consistency
can only be enforced at the endpoint, yielding unstable per-sample targets.&lt;/li>
&lt;/ul>
&lt;p>PCBF resolves both issues through &lt;strong>source-consistent Bellman path correction&lt;/strong> and
&lt;strong>shared-noise path coupling&lt;/strong>, cleanly separating geometric flow requirements from
Bellman bootstrapping variance.&lt;/p>
&lt;hr>
&lt;h2 id="method">Method: Path-Coupled Bellman Flows&lt;/h2>
&lt;h3 id="shared-noise-paths">Shared-noise Bellman paths&lt;/h3>
&lt;p>Given shared base noise $X_0 \sim \mathcal{N}(0,1)$ and a successor return sample
$X' = \psi_{\theta^-}^{1}(X_0 \mid s', a')$ from the target flow map, PCBF defines
time-synchronized linear interpolation paths:&lt;/p>
$$
Z^{s'}_t = (1-t)X_0 + t X'
\qquad\text{(successor path)},
$$
$$
Z^{s}_t = (1-t)X_0 + t\bigl(R + \gamma X'\bigr)
\qquad\text{(current path)}.
$$
&lt;p>An equivalent form that reveals the Bellman geometry is:&lt;/p>
$$
Z^s_t = t R + \gamma Z^{s'}_t + (1-t)(1-\gamma)X_0.
$$
&lt;p>The residual anchor $(1-t)(1-\gamma)X_0$ guarantees exact alignment at $t{=}0$ regardless
of $\gamma$, while $Z^s_1 = R + \gamma X'$ satisfies the distributional Bellman boundary
at $t{=}1$. Differentiating yields the unbiased BCFM target
$\dot Z^s_t = R + \gamma X' - X_0$.&lt;/p>
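&lt;p>The path identities above can be checked numerically. The sketch below uses scalar returns and a Gaussian draw as a stand-in for the successor sample $X'$ (an assumption for illustration; in PCBF $X'$ comes from the target flow map). Function names are hypothetical.&lt;/p>

```python
import random


def successor_path(t, x0, x_next):
    """Z^{s'}_t = (1 - t) X_0 + t X'."""
    return (1.0 - t) * x0 + t * x_next


def current_path(t, x0, x_next, reward, gamma):
    """Z^s_t = (1 - t) X_0 + t (R + gamma X')."""
    return (1.0 - t) * x0 + t * (reward + gamma * x_next)


def current_path_bellman_form(t, x0, x_next, reward, gamma):
    """Z^s_t = t R + gamma Z^{s'}_t + (1 - t)(1 - gamma) X_0."""
    z_next = successor_path(t, x0, x_next)
    return t * reward + gamma * z_next + (1.0 - t) * (1.0 - gamma) * x0


rng = random.Random(0)
x0 = rng.gauss(0.0, 1.0)        # shared base noise
x_next = rng.gauss(2.0, 0.5)    # stand-in for the successor return sample X'
reward, gamma = 1.0, 0.9

# The two forms of the current path agree at every flow time.
max_gap = max(
    abs(current_path(t, x0, x_next, reward, gamma)
        - current_path_bellman_form(t, x0, x_next, reward, gamma))
    for t in [0.0, 0.25, 0.5, 0.75, 1.0]
)

# The path is linear in t, so its velocity is the constant R + gamma X' - X_0.
velocity = reward + gamma * x_next - x0
step = 1e-3
finite_diff = (current_path(0.5 + step, x0, x_next, reward, gamma)
               - current_path(0.5, x0, x_next, reward, gamma)) / step

print(max_gap)                    # agreement up to floating-point round-off
print(abs(finite_diff - velocity))
print(current_path(0.0, x0, x_next, reward, gamma) == x0)  # source boundary
```

&lt;p>Expanding the Bellman form gives $tR + \gamma t X' + (1-t)\bigl(\gamma + 1 - \gamma\bigr)X_0$, which is the linear interpolation itself, so the residual anchor is exactly the term needed to restore the $X_0$ boundary.&lt;/p>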
&lt;h3 id="lambda-target">Lambda-parameterized control variates&lt;/h3>
&lt;p>To reduce variance from the noisy successor sample $X'$, PCBF forms the training target
$u_t^\lambda$ from two pieces:&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Sample Bellman velocity (baseline):&lt;/strong> $Y = R + \gamma X' - X_0$. This is unbiased but
can have high variance because it depends directly on the bootstrapped successor return
$X'$.&lt;/li>
&lt;li>&lt;strong>Control-variate correction:&lt;/strong> $\lambda \cdot \bigl( v_{\theta^-}(t, Z^{s'}_t \mid s', a') - (X' - X_0) \bigr)$,
where $v_{\theta^-}$ is the lagged target velocity field along the successor path
$Z^{s'}_t$.&lt;/li>
&lt;/ul>
&lt;p>Putting them together,&lt;/p>
&lt;p>$u_t^\lambda = Y + \lambda \bigl( v_{\theta^-}(t, Z^{s'}_t \mid s', a') - (X' - X_0) \bigr)$.&lt;/p>
&lt;p>Setting $\lambda = 0$ recovers the unbiased sample Bellman target. Values $\lambda > 0$
introduce a variance-reducing correction using successor-flow velocity predictions. With
shared-noise coupling, the induced bias stays small: in a linear–Gaussian model, shared
noise ($\rho = 1$) gives bias on the order of $(1-\gamma)(1-t)$, which vanishes when
$\gamma \approx 1$ and at the flow endpoints $t \in \{0, 1\}$.&lt;/p>
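&lt;p>A minimal sketch of how the $\lambda$-target could be assembled for one scalar sample. Here &lt;code>v_successor&lt;/code> is a plain number standing in for $v_{\theta^-}(t, Z^{s'}_t \mid s', a')$; the function name and the numeric values are hypothetical and this is not the repository's JAX code.&lt;/p>

```python
def lambda_target(reward, gamma, x0, x_next, v_successor, lam):
    """u_t^lambda = Y + lam * (v_successor - (X' - X_0)).

    v_successor stands for the lagged target velocity evaluated on the
    successor path; here it is simply a number supplied by the caller.
    """
    y = reward + gamma * x_next - x0             # sample Bellman velocity Y
    correction = v_successor - (x_next - x0)     # control-variate term
    return y + lam * correction


reward, gamma, x0, x_next = 1.0, 0.9, 0.3, 2.0
y = reward + gamma * x_next - x0

# lam = 0 recovers the sample Bellman target regardless of v_successor.
print(lambda_target(reward, gamma, x0, x_next, v_successor=5.0, lam=0.0) == y)  # True

# If the successor velocity equals the sample velocity X' - X_0 exactly,
# the correction vanishes for every lam.
exact_v = x_next - x0
print(lambda_target(reward, gamma, x0, x_next, exact_v, lam=0.7) == y)  # True
```

&lt;p>In practice the learned velocity differs from the per-sample value $X' - X_0$, and that gap is what the correction exploits; the size of the resulting bias and variance depends on the model and is not reproduced by this toy.&lt;/p>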
&lt;h3 id="policy-extraction">Policy extraction for offline RL&lt;/h3>
&lt;p>At deployment, a behavior-cloned proposal policy samples $K{=}16$ candidate actions. Each
candidate is scored by the mean terminal return under the learned flow,
$\hat Q_\theta(s,a) = \frac{1}{M}\sum_{m=1}^{M} \psi_\theta^{1}(X_{0,m}\mid s,a)$, averaged over $M$
base-noise draws $X_{0,m}$, and the highest-scoring action is executed.&lt;/p>
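&lt;p>The selection rule can be sketched as a best-of-$K$ loop. Everything below is a stand-in: &lt;code>flow_endpoint&lt;/code> plays the role of $\psi_\theta^{1}$, &lt;code>propose_action&lt;/code> the behavior-cloned proposal, and the number of noise draws (32) is an assumed value, since the text does not state $M$.&lt;/p>

```python
import random


def estimate_q(flow_endpoint, state, action, num_samples, rng):
    """Monte Carlo mean of the terminal flow map over base-noise draws."""
    total = 0.0
    for _ in range(num_samples):
        x0 = rng.gauss(0.0, 1.0)
        total += flow_endpoint(x0, state, action)
    return total / num_samples


def select_action(flow_endpoint, propose_action, state, num_candidates=16,
                  num_samples=32, seed=0):
    """Sample candidates from the proposal and return the best-scoring one."""
    rng = random.Random(seed)
    candidates = [propose_action(state, rng) for _ in range(num_candidates)]
    scores = [estimate_q(flow_endpoint, state, a, num_samples, rng)
              for a in candidates]
    best_index = max(range(num_candidates), key=lambda i: scores[i])
    return candidates[best_index], scores[best_index]


# Toy stand-ins (hypothetical): mean return peaks at action 0.5, small noise.
def toy_flow_endpoint(x0, state, action):
    return -(action - 0.5) ** 2 + 0.01 * x0


def toy_proposal(state, rng):
    return rng.uniform(0.0, 1.0)


action, score = select_action(toy_flow_endpoint, toy_proposal, state=None)
print(action, score)
```

&lt;p>Because only the mean of the terminal samples is used for ranking, the same loop would accept any other statistic of the return distribution by swapping the reduction in &lt;code>estimate_q&lt;/code>.&lt;/p>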
&lt;hr>
&lt;h2 id="toy-environments">Toy Environments: Distributional Fidelity&lt;/h2>
&lt;p>We validate PCBF on three analytically tractable environments with known return laws:
&lt;strong>Solitaire Dice&lt;/strong> (heavy-tailed discrete returns), &lt;strong>Bernoulli MRP&lt;/strong> (uniform return on
$[0,2]$), and &lt;strong>Discrete Monte Carlo Chain&lt;/strong> (multimodal finite-horizon returns).&lt;/p>
&lt;figure id="figure-figure-2-learned-pcbf-maps-on-toy-environments-solitaire-top-left-bernoulli-top-right-discrete-mc-bottom-pcbf-recovers-heavy-tailed-uniform-and-multimodal-return-structures-and-closely-matches-ground-truth-histograms">
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" style="width: 100%; ">&lt;img alt="Learned PCBF maps on Solitaire, Bernoulli, and Discrete MC toy environments" srcset="
/xu-path-coupled-icml-2026/figures/physics_combined_hufca0a029fbcaa665ca59a9f8c7acda01_1272620_097b50046d8c7f8176e55e305adb21b2.webp 400w,
/xu-path-coupled-icml-2026/figures/physics_combined_hufca0a029fbcaa665ca59a9f8c7acda01_1272620_853ad70ad0e59d0c8b0d1a8e726e0b9c.webp 760w,
/xu-path-coupled-icml-2026/figures/physics_combined_hufca0a029fbcaa665ca59a9f8c7acda01_1272620_1200x1200_fit_q75_h2_lanczos_3.webp 1200w"
src="https://hyan46.github.io/xu-path-coupled-icml-2026/figures/physics_combined_hufca0a029fbcaa665ca59a9f8c7acda01_1272620_097b50046d8c7f8176e55e305adb21b2.webp"
loading="lazy"
style="width: 90%; height: auto; display: block;" />&lt;/div>
&lt;/div>&lt;figcaption>
&lt;span class="figure-number">Figure 2: &lt;/span>&lt;strong>Learned PCBF maps on toy environments.&lt;/strong> Solitaire (top left), Bernoulli (top right), Discrete MC (bottom). PCBF recovers heavy-tailed, uniform, and multimodal return structures and closely matches ground-truth histograms.
&lt;/figcaption>&lt;/figure>
&lt;figure id="figure-figure-3-distributional-accuracy-on-toy-environments-learned-return-cdfs-for-pcbf-and-value-flows-dcfm-in-0-05-1-versus-ground-truth-references-pcbf-consistently-tracks-the-reference-cdfs-value-flows-degrades-as-dcfm-increases-systematically-underestimating-return-variance">
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" style="width: 100%; ">&lt;img alt="CDF comparison of PCBF vs Value Flows on toy environments" srcset="
/xu-path-coupled-icml-2026/figures/toy22_hu0dcfbadcbff05a3ab4013d9c2dd219a9_131428_123c4ba7a42f2001c8d2f13204355c1a.webp 400w,
/xu-path-coupled-icml-2026/figures/toy22_hu0dcfbadcbff05a3ab4013d9c2dd219a9_131428_ecb6f40bb862be0f6c885ae09ee3e7d0.webp 760w,
/xu-path-coupled-icml-2026/figures/toy22_hu0dcfbadcbff05a3ab4013d9c2dd219a9_131428_1200x1200_fit_q75_h2_lanczos_3.webp 1200w"
src="https://hyan46.github.io/xu-path-coupled-icml-2026/figures/toy22_hu0dcfbadcbff05a3ab4013d9c2dd219a9_131428_123c4ba7a42f2001c8d2f13204355c1a.webp"
loading="lazy"
style="width: 90%; height: auto; display: block;" />&lt;/div>
&lt;/div>&lt;figcaption>
&lt;span class="figure-number">Figure 3: &lt;/span>&lt;strong>Distributional accuracy on toy environments.&lt;/strong> Learned return CDFs for PCBF and Value Flows (dcfm $\in \{0, 0.5, 1\}$) versus ground-truth references. PCBF consistently tracks the reference CDFs; Value Flows degrades as dcfm increases, systematically underestimating return variance.
&lt;/figcaption>&lt;/figure>
&lt;figure id="figure-figure-4-hyperparameter-sensitivity-pcbf-vs-value-flows-on-solitaire-and-discrete-mc-increasing-value-flows-dcfm-coefficient-degrades-wasserstein-error-while-pcbfs-lambda-target-remains-robust-across-a-wide-range-of-values">
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" style="width: 100%; ">&lt;img alt="Hyperparameter sensitivity of PCBF vs Value Flows on Solitaire and Discrete MC" srcset="
/xu-path-coupled-icml-2026/figures/two_ablation_hu1b966fd1ebe0e14665f7c6108986d77b_137262_5a027b03153b94dd54b96d6a35e57e56.webp 400w,
/xu-path-coupled-icml-2026/figures/two_ablation_hu1b966fd1ebe0e14665f7c6108986d77b_137262_efe33ea10aa36415e9e4a98d88243e49.webp 760w,
/xu-path-coupled-icml-2026/figures/two_ablation_hu1b966fd1ebe0e14665f7c6108986d77b_137262_1200x1200_fit_q75_h2_lanczos_3.webp 1200w"
src="https://hyan46.github.io/xu-path-coupled-icml-2026/figures/two_ablation_hu1b966fd1ebe0e14665f7c6108986d77b_137262_5a027b03153b94dd54b96d6a35e57e56.webp"
loading="lazy"
style="width: 90%; height: auto; display: block;" />&lt;/div>
&lt;/div>&lt;figcaption>
&lt;span class="figure-number">Figure 4: &lt;/span>&lt;strong>Hyperparameter sensitivity (PCBF vs. Value Flows).&lt;/strong> On Solitaire and Discrete MC, increasing Value Flows&amp;rsquo; dcfm coefficient degrades Wasserstein error, while PCBF&amp;rsquo;s $\lambda$-target remains robust across a wide range of values.
&lt;/figcaption>&lt;/figure>
&lt;figure id="figure-figure-5-variance-reduction-via-lambda-parameterized-control-variates-larger-lambda-yields-smoother-bellman-velocity-regression-loss-trajectories-lower-within-run-standard-deviation-validating-the-control-variate-mechanism">
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" style="width: 100%; ">&lt;img alt="Variance reduction via lambda control variates during training" srcset="
/xu-path-coupled-icml-2026/figures/variance_reduction_hu1a747344d4fac88e0ddace86b41e5b7e_127284_55fa780575e8f5c6d50cd9fd502fc76f.webp 400w,
/xu-path-coupled-icml-2026/figures/variance_reduction_hu1a747344d4fac88e0ddace86b41e5b7e_127284_08d444d5a1287cc5044dd0a02c28aede.webp 760w,
/xu-path-coupled-icml-2026/figures/variance_reduction_hu1a747344d4fac88e0ddace86b41e5b7e_127284_1200x1200_fit_q75_h2_lanczos_3.webp 1200w"
src="https://hyan46.github.io/xu-path-coupled-icml-2026/figures/variance_reduction_hu1a747344d4fac88e0ddace86b41e5b7e_127284_55fa780575e8f5c6d50cd9fd502fc76f.webp"
loading="lazy"
style="width: 80%; height: auto; display: block;" />&lt;/div>
&lt;/div>&lt;figcaption>
&lt;span class="figure-number">Figure 5: &lt;/span>&lt;strong>Variance reduction via $\lambda$-parameterized control variates.&lt;/strong> Larger $\lambda$ yields smoother Bellman velocity regression loss trajectories (lower within-run standard deviation), validating the control-variate mechanism.
&lt;/figcaption>&lt;/figure>
&lt;hr>
&lt;h2 id="path-consistency">Pathwise Bellman Residual and Discretization&lt;/h2>
&lt;p>PCBF enforces the Bellman endpoint at $t{=}1$ by construction, but training uses a
finite-step Euler solver (10 NFE). Shared-noise coupling yields smaller &lt;strong>corrected
Bellman residuals&lt;/strong> $r_{\mathrm{corr}}(t,N)$ than independent-noise ablations across
solver budgets $N \in \{4,8,16,32\}$:&lt;/p>
&lt;figure id="figure-figure-6-corrected-bellman-residual-r_mathrmcorrtn-on-solitaire-dice-shared-noise-pcbf-blue-maintains-lower-residuals-than-independent-noise-coupling-orange-across-flow-times-and-euler-budgets">
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" style="width: 100%; ">&lt;img alt="Corrected Bellman residual on Solitaire Dice for shared vs independent noise coupling" srcset="
/xu-path-coupled-icml-2026/figures/nfe_hua7d3c173ccc1cb1fc20db9d794571924_101634_5a9ef4c9f72e14fa5f15897607e3b55d.webp 400w,
/xu-path-coupled-icml-2026/figures/nfe_hua7d3c173ccc1cb1fc20db9d794571924_101634_6e9df35176ecf054e979ec0790f83267.webp 760w,
/xu-path-coupled-icml-2026/figures/nfe_hua7d3c173ccc1cb1fc20db9d794571924_101634_1200x1200_fit_q75_h2_lanczos_3.webp 1200w"
src="https://hyan46.github.io/xu-path-coupled-icml-2026/figures/nfe_hua7d3c173ccc1cb1fc20db9d794571924_101634_5a9ef4c9f72e14fa5f15897607e3b55d.webp"
loading="lazy"
style="width: 80%; height: auto; display: block;" />&lt;/div>
&lt;/div>&lt;figcaption>
&lt;span class="figure-number">Figure 6: &lt;/span>&lt;strong>Corrected Bellman residual&lt;/strong> $r_{\mathrm{corr}}(t,N)$ on Solitaire Dice. Shared-noise PCBF (blue) maintains lower residuals than independent-noise coupling (orange) across flow times and Euler budgets.
&lt;/figcaption>&lt;/figure>
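&lt;p>For readers unfamiliar with the solver budget $N$: an explicit Euler solver advances the sample with $N$ equal steps of the velocity field from $t{=}0$ to $t{=}1$. The sketch below applies it to a toy velocity field with a known solution (our own choice, not a learned PCBF field) to show how the discretization error shrinks as $N$ grows. It does not compute the corrected residual $r_{\mathrm{corr}}$.&lt;/p>

```python
import math


def euler_integrate(velocity, x0, num_steps):
    """Integrate dz/dt = velocity(t, z) from t=0 to t=1 with explicit Euler."""
    z = x0
    dt = 1.0 / num_steps
    for k in range(num_steps):
        t = k * dt
        z = z + dt * velocity(t, z)
    return z


# Toy velocity field (hypothetical): dz/dt = target - z, whose exact solution
# at t=1 is target + (x0 - target) * exp(-1).
target, x0 = 2.0, -1.0
exact = target + (x0 - target) * math.exp(-1.0)

errors = {}
for n in (4, 8, 16, 32):
    z1 = euler_integrate(lambda t, z: target - z, x0, n)
    errors[n] = abs(z1 - exact)
    print(n, errors[n])

# A velocity that is constant in t and z is integrated exactly by Euler.
print(euler_integrate(lambda t, z: 1.5, 0.0, 4))  # 1.5
```

&lt;p>The last line is the reason straight-line interpolation paths are convenient: along a single linear path the target velocity is constant, so the discretization error comes only from how the learned field varies with $t$ and $z$.&lt;/p>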
&lt;hr>
&lt;h2 id="offline-rl-benchmarks">Offline RL Benchmarks&lt;/h2>
&lt;p>We evaluate PCBF on &lt;strong>38 offline RL tasks&lt;/strong>: 30 OGBench single-task variants (four
state-based manipulation domains and two pixel-based domains) plus eight D4RL Adroit tasks.
Baselines include distributional methods (IQN, CODAC, Value Flows), flow-based scalar
critics (FloQ, FQL), and IQL.&lt;/p>
&lt;figure id="figure-figure-7-ogbench-tasks-state-based-cube-scene-and-puzzle-manipulation-environments-and-pixel-based-visual-variants-used-in-our-evaluation">
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" style="width: 100%; ">&lt;img alt="OGBench task illustrations" srcset="
/xu-path-coupled-icml-2026/figures/ogbench_hub9bc7e2659a4678c92a9f8dd67bd8f62_503234_e186c92b9267115b0189a2b2e0111064.webp 400w,
/xu-path-coupled-icml-2026/figures/ogbench_hub9bc7e2659a4678c92a9f8dd67bd8f62_503234_502ececf3d83746f0ce32528258521fa.webp 760w,
/xu-path-coupled-icml-2026/figures/ogbench_hub9bc7e2659a4678c92a9f8dd67bd8f62_503234_1200x1200_fit_q75_h2_lanczos_3.webp 1200w"
src="https://hyan46.github.io/xu-path-coupled-icml-2026/figures/ogbench_hub9bc7e2659a4678c92a9f8dd67bd8f62_503234_e186c92b9267115b0189a2b2e0111064.webp"
loading="lazy"
style="width: 70%; height: auto; display: block;" />&lt;/div>
&lt;/div>&lt;figcaption>
&lt;span class="figure-number">Figure 7: &lt;/span>&lt;strong>OGBench tasks.&lt;/strong> State-based cube, scene, and puzzle manipulation environments and pixel-based visual variants used in our evaluation.
&lt;/figcaption>&lt;/figure>
&lt;h3 id="quantitative">Aggregated results&lt;/h3>
&lt;style>
.pcbf-results-wrap { overflow-x: auto; margin: 1.25rem 0; }
table.pcbf-results {
width: 100%;
border-collapse: collapse;
font-size: 0.92rem;
font-family: 'Noto Sans', sans-serif;
background: #fff;
}
table.pcbf-results th, table.pcbf-results td {
padding: 8px 10px;
text-align: center;
border-bottom: 1px solid #e6e6e6;
}
table.pcbf-results thead tr.group th {
background: #f5f7fa;
font-weight: 700;
border-bottom: 1px solid #d6d9df;
}
table.pcbf-results td.domain, table.pcbf-results th.domain {
text-align: left;
font-weight: 500;
white-space: nowrap;
}
table.pcbf-results tr.proposed {
background: #eaf3ff;
font-weight: 700;
}
table.pcbf-results tr.proposed td { border-bottom: 1px solid #c9def5; }
table.pcbf-results td.best { color: #0a66c2; font-weight: 700; }
table.pcbf-results caption {
caption-side: top;
text-align: left;
padding: 0.25rem 0 0.75rem 0;
font-size: 0.95rem;
color: #444;
}
&lt;/style>
&lt;div class="pcbf-results-wrap">
&lt;table class="pcbf-results">
&lt;caption>&lt;strong>Table 1.&lt;/strong> Offline RL results on OGBench and D4RL Adroit.
Success rates (%) for OGBench domains (5 tasks each) and normalized scores for D4RL.
Results averaged over 8 seeds (4 for pixel tasks). Bold values are within 95% of the
best method on each domain; &lt;em>PCBF (Ours)&lt;/em> is highlighted.&lt;/caption>
&lt;thead>
&lt;tr class="group">
&lt;th class="domain">Domain&lt;/th>
&lt;th>IQN&lt;/th>
&lt;th>CODAC&lt;/th>
&lt;th>FloQ&lt;/th>
&lt;th>FQL&lt;/th>
&lt;th>IQL&lt;/th>
&lt;th>Value Flows&lt;/th>
&lt;th>PCBF (Ours)&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td class="domain">cube-double-play (5 tasks)&lt;/td>
&lt;td>42 ± 8&lt;/td>&lt;td>61 ± 6&lt;/td>&lt;td>47 ± 14&lt;/td>&lt;td>29 ± 6&lt;/td>&lt;td>7 ± 1&lt;/td>&lt;td>69 ± 4&lt;/td>
&lt;td class="best">71 ± 5&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td class="domain">scene-play (5 tasks)&lt;/td>
&lt;td>40 ± 1&lt;/td>&lt;td>55 ± 1&lt;/td>&lt;td class="best">58 ± 4&lt;/td>&lt;td>56 ± 2&lt;/td>&lt;td>28 ± 3&lt;/td>&lt;td class="best">59 ± 4&lt;/td>
&lt;td>54 ± 4&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td class="domain">puzzle-4×4-play (5 tasks)&lt;/td>
&lt;td>27 ± 4&lt;/td>&lt;td>20 ± 18&lt;/td>&lt;td>28 ± 6&lt;/td>&lt;td>17 ± 5&lt;/td>&lt;td>7 ± 2&lt;/td>&lt;td>27 ± 4&lt;/td>
&lt;td class="best">30 ± 4&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td class="domain">cube-triple-play (5 tasks)&lt;/td>
&lt;td>6 ± 0&lt;/td>&lt;td>2 ± 1&lt;/td>&lt;td>8 ± 3&lt;/td>&lt;td>4 ± 2&lt;/td>&lt;td>1 ± 1&lt;/td>&lt;td class="best">14 ± 3&lt;/td>
&lt;td>4 ± 1&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td class="domain">D4RL adroit (8 tasks)&lt;/td>
&lt;td>66 ± 5&lt;/td>&lt;td>69 ± 0&lt;/td>&lt;td>70 ± 5&lt;/td>&lt;td class="best">71 ± 4&lt;/td>&lt;td>70&lt;/td>&lt;td>65 ± 2&lt;/td>
&lt;td class="best">69 ± 2&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td class="domain">visual-antmaze-teleport (5 tasks)&lt;/td>
&lt;td>4 ± 2&lt;/td>&lt;td>—&lt;/td>&lt;td>—&lt;/td>&lt;td>5 ± 2&lt;/td>&lt;td>6 ± 4&lt;/td>&lt;td>13 ± 4&lt;/td>
&lt;td class="best">14 ± 4&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td class="domain">visual-cube-double-play (5 tasks)&lt;/td>
&lt;td>1 ± 0&lt;/td>&lt;td>—&lt;/td>&lt;td>—&lt;/td>&lt;td>6 ± 1&lt;/td>&lt;td>11 ± 6&lt;/td>&lt;td class="best">13 ± 2&lt;/td>
&lt;td>3 ± 0&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;/div>
&lt;p>&lt;strong>Takeaways.&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Selective but strong gains.&lt;/strong> PCBF achieves best or near-best aggregate performance on
&lt;strong>cube-double-play&lt;/strong>, &lt;strong>puzzle-4×4-play&lt;/strong>, &lt;strong>D4RL Adroit&lt;/strong>, and
&lt;strong>visual-antmaze-teleport&lt;/strong>, where critic-side return-law fidelity and variance-controlled
bootstrapping affect action ranking.&lt;/li>
&lt;li>&lt;strong>Best distributional fidelity on toys.&lt;/strong> On analytically tractable MRPs, PCBF closely
tracks ground-truth CDFs and remains robust to $\lambda$, while Value Flows degrades as
the DCFM consistency weight increases.&lt;/li>
&lt;li>&lt;strong>Honest limitations.&lt;/strong> On &lt;strong>cube-triple-play&lt;/strong> and &lt;strong>visual-cube-double-play&lt;/strong>, PCBF
underperforms Value Flows — long-horizon sparse-reward and pixel-based settings remain
challenging when policy extraction, visual encoders, or $\lambda$ selection become
bottlenecks.&lt;/li>
&lt;li>&lt;strong>Similar cost to Value Flows.&lt;/strong> PCBF uses ~60 GB GPU memory and ~2.5× wall-clock versus
scalar critics on OGBench (single A100, $10^6$ steps); training requires 10-step Euler
integration of the velocity field.&lt;/li>
&lt;/ul>
&lt;hr>
&lt;h2 id="key-ideas">Key Contributions&lt;/h2>
&lt;ul>
&lt;li>&lt;strong>Source-consistent Bellman-interpolated paths&lt;/strong> that resolve the $t{=}0$ boundary mismatch
of uncorrected pointwise Bellman paths while preserving the Bellman endpoint at $t{=}1$.&lt;/li>
&lt;li>&lt;strong>Shared-noise path coupling&lt;/strong> that aligns current and successor return flows pathwise,
inducing a geometric Bellman relation between velocity fields.&lt;/li>
&lt;li>&lt;strong>$\lambda$-parameterized control-variate target&lt;/strong> with a distribution-free $L_2$ bias
bound and a linear–Gaussian closed form explaining why shared-noise coupling shrinks
intrinsic bias.&lt;/li>
&lt;li>&lt;strong>Population velocity identification&lt;/strong>, shared-noise Bellman contraction, and Euler
integration sensitivity analysis supporting stable flow-based distributional critics.&lt;/li>
&lt;li>&lt;strong>Comprehensive evaluation&lt;/strong> on Solitaire Dice, Bernoulli, and Discrete MC toy MRPs plus
38 OGBench and D4RL offline RL tasks.&lt;/li>
&lt;/ul>
&lt;hr>
&lt;h2 id="quickstart">Quick Start&lt;/h2>
&lt;p>The reference implementation is available on GitHub:
&lt;a href="https://github.com/BoyangASU/path-coupled-bellman-flows" target="_blank" rel="noopener">&lt;strong>BoyangASU/path-coupled-bellman-flows&lt;/strong>&lt;/a>.&lt;/p>
&lt;p>PCBF is implemented in JAX, adapted from the FQL codebase. Key hyperparameters: 10 Euler
integration steps, batch size 256, learning rate $3\times10^{-4}$, and domain-tuned
$\lambda$ (see the tables in the paper for per-domain values). State-based tasks train for 1M
gradient steps; pixel-based tasks for 500K steps.&lt;/p>
&lt;hr>
&lt;h2 id="resources">Resources&lt;/h2>
&lt;ul>
&lt;li>&lt;strong>Paper (arXiv):&lt;/strong> &lt;a href="https://arxiv.org/abs/2605.08253" target="_blank" rel="noopener">arXiv:2605.08253&lt;/a>&lt;/li>
&lt;li>&lt;strong>Code:&lt;/strong> &lt;a href="https://github.com/BoyangASU/path-coupled-bellman-flows" target="_blank" rel="noopener">github.com/BoyangASU/path-coupled-bellman-flows&lt;/a>&lt;/li>
&lt;li>&lt;strong>Venue:&lt;/strong> ICML 2026 (regular track)&lt;/li>
&lt;/ul>
&lt;hr>
&lt;h2 id="bibtex">BibTeX&lt;/h2>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-bibtex" data-lang="bibtex">&lt;span class="line">&lt;span class="cl">&lt;span class="nc">@inproceedings&lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="nl">xu2026pathcoupled&lt;/span>&lt;span class="p">,&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="na">title&lt;/span> &lt;span class="p">=&lt;/span> &lt;span class="s">{Path-Coupled Bellman Flows for Distributional Reinforcement Learning}&lt;/span>&lt;span class="p">,&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="na">author&lt;/span> &lt;span class="p">=&lt;/span> &lt;span class="s">{Xu, Boyang and Zou, Qing and Yang, Siqin and Yan, Hao}&lt;/span>&lt;span class="p">,&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="na">booktitle&lt;/span> &lt;span class="p">=&lt;/span> &lt;span class="s">{Proceedings of the International Conference on Machine Learning (ICML)}&lt;/span>&lt;span class="p">,&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="na">year&lt;/span> &lt;span class="p">=&lt;/span> &lt;span class="s">{2026}&lt;/span>&lt;span class="p">,&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="na">note&lt;/span> &lt;span class="p">=&lt;/span> &lt;span class="s">{Regular track}&lt;/span>&lt;span class="p">,&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="na">eprint&lt;/span> &lt;span class="p">=&lt;/span> &lt;span class="s">{2605.08253}&lt;/span>&lt;span class="p">,&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="na">archivePrefix&lt;/span> &lt;span class="p">=&lt;/span> &lt;span class="s">{arXiv}&lt;/span>&lt;span class="p">,&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="na">primaryClass&lt;/span> &lt;span class="p">=&lt;/span> &lt;span class="s">{cs.LG}&lt;/span>&lt;span class="p">,&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="na">url&lt;/span> &lt;span class="p">=&lt;/span> &lt;span class="s">{https://arxiv.org/abs/2605.08253}&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="p">}&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div></description></item><item><title>Multi-modal Generative Modeling of Event Sequences and Time Series for Solar PV Systems</title><link>https://hyan46.github.io/publication/huang-multimodal-case-2025/</link><pubDate>Wed, 01 Jan 2025 00:00:00 +0000</pubDate><guid>https://hyan46.github.io/publication/huang-multimodal-case-2025/</guid><description/></item><item><title>Probabilistic Kolmogorov-Arnold Networks via sparsified deep Gaussian processes with additive kernels</title><link>https://hyan46.github.io/publication/zou-probabilistic-kan-case-2025/</link><pubDate>Wed, 01 Jan 2025 00:00:00 +0000</pubDate><guid>https://hyan46.github.io/publication/zou-probabilistic-kan-case-2025/</guid><description/></item><item><title>Graph-aware Tensor Topic Models for Individualized Passenger Travel Pattern Clustering</title><link>https://hyan46.github.io/publication/li-graph-tensor-iise-2023/</link><pubDate>Sun, 01 Jan 2023 00:00:00 +0000</pubDate><guid>https://hyan46.github.io/publication/li-graph-tensor-iise-2023/</guid><description/></item><item><title>Tensor dirichlet process multinomial mixture model with graphs for passenger trajectory clustering</title><link>https://hyan46.github.io/publication/li-tensor-dpmm-sigspatial-2023/</link><pubDate>Sun, 01 Jan 2023 00:00:00 +0000</pubDate><guid>https://hyan46.github.io/publication/li-tensor-dpmm-sigspatial-2023/</guid><description/></item><item><title>Attention-based Representation Learning for Time Series with Principal and Residual Space Monitoring</title><link>https://hyan46.github.io/publication/wang-attentionbased-2022/</link><pubDate>Sat, 01 Jan 2022 00:00:00 +0000</pubDate><guid>https://hyan46.github.io/publication/wang-attentionbased-2022/</guid><description/></item><item><title>Event Extraction for aviation accident reports through attention-based multi-label classification</title><link>https://hyan46.github.io/publication/zhao-event-extraction-aiaa-2022/</link><pubDate>Sat, 01 
Jan 2022 00:00:00 +0000</pubDate><guid>https://hyan46.github.io/publication/zhao-event-extraction-aiaa-2022/</guid><description/></item><item><title>Combining Anatomical Constraints and Deep Learning for 3-D CBCT Dental Image Multi-Label Segmentation</title><link>https://hyan46.github.io/publication/huang-combining-2021/</link><pubDate>Mon, 19 Apr 2021 00:00:00 +0000</pubDate><guid>https://hyan46.github.io/publication/huang-combining-2021/</guid><description/></item><item><title>Edge Computing Accelerated Defect Classification Based on Deep Convolutional Neural Network With Application in Rolling Image Inspection</title><link>https://hyan46.github.io/publication/huang-edge-2021/</link><pubDate>Fri, 01 Jan 2021 00:00:00 +0000</pubDate><guid>https://hyan46.github.io/publication/huang-edge-2021/</guid><description/></item><item><title>Hierarchical Tree-Based Sequential Event Prediction with Application in the Aviation Accident Report</title><link>https://hyan46.github.io/publication/zhao-hierarchical-2021/</link><pubDate>Fri, 01 Jan 2021 00:00:00 +0000</pubDate><guid>https://hyan46.github.io/publication/zhao-hierarchical-2021/</guid><description/></item><item><title>Tensor Completion for Weakly-Dependent Data on Graph for Metro Passenger Flow Prediction</title><link>https://hyan46.github.io/publication/li-tensor-2020/</link><pubDate>Tue, 01 Dec 2020 00:00:00 +0000</pubDate><guid>https://hyan46.github.io/publication/li-tensor-2020/</guid><description/></item><item><title>Simultaneous material microstructure classification and discovery via hidden Markov modeling of acoustic emission signals</title><link>https://hyan46.github.io/publication/zhao-simultaneous-2021/</link><pubDate>Wed, 01 Jan 2020 00:00:00 +0000</pubDate><guid>https://hyan46.github.io/publication/zhao-simultaneous-2021/</guid><description/></item><item><title>Image-Based Process Monitoring via Adversarial Autoencoder with Applications to Rolling Defect 
Detection</title><link>https://hyan46.github.io/publication/yan-imagebased-2019/</link><pubDate>Thu, 01 Aug 2019 00:00:00 +0000</pubDate><guid>https://hyan46.github.io/publication/yan-imagebased-2019/</guid><description/></item><item><title>Physics-Based Deep Spatio-Temporal Metamodeling for Cardiac Electrical Conduction Simulation</title><link>https://hyan46.github.io/publication/yan-physicsbased-2019/</link><pubDate>Thu, 01 Aug 2019 00:00:00 +0000</pubDate><guid>https://hyan46.github.io/publication/yan-physicsbased-2019/</guid><description/></item><item><title>Rapid Detection of Hot-Spot by Tensor Decomposition with Application to Weekly Gonorrhea Data</title><link>https://hyan46.github.io/publication/zhao-rapid-2019/</link><pubDate>Tue, 01 Jan 2019 00:00:00 +0000</pubDate><guid>https://hyan46.github.io/publication/zhao-rapid-2019/</guid><description/></item><item><title>Semi-supervised constrained hidden Markov model using multiple sensors for remaining useful life prediction and optimal predictive maintenance— For remaining useful life prediction and optimal predictive maintenance</title><link>https://hyan46.github.io/publication/zhao-semi-supervised-hmm-phm-2019/</link><pubDate>Tue, 01 Jan 2019 00:00:00 +0000</pubDate><guid>https://hyan46.github.io/publication/zhao-semi-supervised-hmm-phm-2019/</guid><description/></item><item><title>Real-time production performance analysis using machine degradation signals— A two-machine case</title><link>https://hyan46.github.io/publication/kang-realtime-2018/</link><pubDate>Mon, 01 Jan 2018 00:00:00 +0000</pubDate><guid>https://hyan46.github.io/publication/kang-realtime-2018/</guid><description/></item><item><title>Point Cloud Data Analysis for Process Modeling and Optimization</title><link>https://hyan46.github.io/publication/pacella-point-cloud-informs-2017/</link><pubDate>Sun, 01 Jan 2017 00:00:00 
+0000</pubDate><guid>https://hyan46.github.io/publication/pacella-point-cloud-informs-2017/</guid><description/></item><item><title>Frequency Domain Instantaneous Wavenumber Estimation for Damage Quantification in Layered Plate Structures</title><link>https://hyan46.github.io/publication/mesnil-frequency-2014/</link><pubDate>Wed, 01 Jan 2014 00:00:00 +0000</pubDate><guid>https://hyan46.github.io/publication/mesnil-frequency-2014/</guid><description/></item></channel></rss>