Exploring Probabilistic Modeling Beyond Domain Generalization for Semantic Segmentation

¹National Taiwan University, ²University of Washington, ³Microsoft, ⁴Chang Gung University
(ICCV 2025)

PDAF is a probabilistic framework for enhancing the robustness of semantic segmentation under domain shift.



Abstract

Domain Generalized Semantic Segmentation (DGSS) is a critical yet challenging task, as domain shifts in unseen environments can severely compromise model performance. While recent studies enhance feature alignment by projecting features into the source domain, they often neglect intrinsic latent domain priors, leading to suboptimal results. In this paper, we introduce PDAF, a Probabilistic Diffusion Alignment Framework that enhances the generalization of existing segmentation networks through probabilistic diffusion modeling. PDAF introduces a Latent Domain Prior (LDP) to capture domain shifts and uses this prior as a conditioning factor to align both source and unseen target domains. To achieve this, PDAF integrates into a pre-trained segmentation model and utilizes paired source and pseudo-target images to simulate latent domain shifts, enabling LDP modeling. The framework comprises three modules: the Latent Prior Extractor (LPE) predicts the LDP by supervising domain shifts; the Domain Compensation Module (DCM) adjusts feature representations to mitigate domain shifts; and the Diffusion Prior Estimator (DPE) leverages a diffusion process to estimate the LDP without requiring paired samples. This design enables PDAF to iteratively model domain shifts, progressively refining feature representations to enhance generalization under complex target conditions. Extensive experiments validate the effectiveness of PDAF across diverse and challenging urban scenes.


Architecture of PDAF

PDAF augments a pre-trained segmentation network with LDP modeling to enhance its domain generalization. The LPE learns the optimal LDP by modeling cross-domain relationships between the source and pseudo-target domains, while the DPE estimates the LDP using only target inputs. Finally, the DCM enhances the segmentation network with LDP guidance, refining feature alignment and improving domain generalization.

To generalize across domain shifts, we cast DGSS as a probabilistic learning problem. The prediction function of PDAF is formulated as:
\[ p_{\theta,\phi}(y_t \mid x_t) = \int p_\theta(y_t \mid x_t, z)\, p_\phi(z \mid x_t)\, dz \]
Here \(x_t\) is a target (or pseudo-target) image, \(y_t\) its segmentation label, \(x_s\) the paired source image, and \(z\) the LDP. The objective function is derived as follows:
\[ \log p_{\theta, \phi}(y_t \mid x_t) \geq \mathbb{E}_{q_\varphi(z \mid x_t, x_s)} \left[ \log p_\theta(y_t \mid x_t, z) \right] - \mathbb{KL} \left[ q_\varphi(z \mid x_t, x_s) \,\|\, p_\phi(z \mid x_t) \right]. \]
Each term in this objective is modeled and optimized by one of the following modules.
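For readers who want the intermediate step, the bound can be recovered with the standard variational argument. This is a sketch rather than a derivation taken from the paper. It assumes that \(q_\varphi(z \mid x_t, x_s)\) is positive wherever the integrand is nonzero, and it applies Jensen's inequality to the concave logarithm:
\[ \log p_{\theta,\phi}(y_t \mid x_t) = \log \int p_\theta(y_t \mid x_t, z)\, p_\phi(z \mid x_t)\, dz = \log \mathbb{E}_{q_\varphi(z \mid x_t, x_s)} \left[ \frac{p_\theta(y_t \mid x_t, z)\, p_\phi(z \mid x_t)}{q_\varphi(z \mid x_t, x_s)} \right] \]
\[ \geq \mathbb{E}_{q_\varphi(z \mid x_t, x_s)} \left[ \log p_\theta(y_t \mid x_t, z) \right] + \mathbb{E}_{q_\varphi(z \mid x_t, x_s)} \left[ \log \frac{p_\phi(z \mid x_t)}{q_\varphi(z \mid x_t, x_s)} \right] = \mathbb{E}_{q_\varphi(z \mid x_t, x_s)} \left[ \log p_\theta(y_t \mid x_t, z) \right] - \mathbb{KL} \left[ q_\varphi(z \mid x_t, x_s) \,\|\, p_\phi(z \mid x_t) \right]. \]
The first term rewards accurate segmentation given a latent drawn from the paired posterior. The second term pulls the target-only prior \(p_\phi(z \mid x_t)\) toward that posterior, so that the prior can stand in for it when no paired source image is available.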

Proposed Modules

LPE Diagram
DCM Diagram
DPE Diagram

Latent Prior Extractor (LPE) is designed to estimate the optimal LDP by supervising cross-domain feature relationships, providing effective guidance for feature alignment.

Domain Compensation Module (DCM) leverages the LDP as a compensation mechanism, providing domain-aware modulation to improve feature alignment.
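To make "domain-aware modulation" concrete, the sketch below conditions a feature map on a latent vector using a FiLM-style per-channel scale and shift. This is an illustrative assumption, not the DCM architecture from the paper: the function name `modulate_features` and the projection weights `w_scale` and `w_shift` are hypothetical, and a real module would learn them and operate on tensors.

```python
from typing import List


def modulate_features(
    features: List[List[float]],   # C channels x N spatial positions
    ldp: List[float],              # latent domain prior, length D
    w_scale: List[List[float]],    # hypothetical C x D projection for the scale
    w_shift: List[List[float]],    # hypothetical C x D projection for the shift
) -> List[List[float]]:
    """FiLM-style modulation: out[c][i] = (1 + gamma[c]) * features[c][i] + beta[c].

    gamma and beta are linear projections of the latent vector, so a zero
    latent leaves the features unchanged (no compensation applied).
    """
    def project(weights: List[List[float]]) -> List[float]:
        # One dot product per channel: weights[c] . ldp
        return [sum(w * z for w, z in zip(row, ldp)) for row in weights]

    gamma = project(w_scale)
    beta = project(w_shift)
    return [
        [(1.0 + g) * value + b for value in channel]
        for channel, g, b in zip(features, gamma, beta)
    ]


# Toy example: 2 channels, 2 positions, 2-dimensional latent.
features = [[1.0, 2.0], [3.0, 4.0]]
w_scale = [[1.0, 0.0], [0.0, 1.0]]
w_shift = [[0.0, 0.0], [1.0, 1.0]]
compensated = modulate_features(features, [0.5, -1.0], w_scale, w_shift)
```

The residual form `1 + gamma` is a design choice in this sketch: when the latent carries no shift information, the pre-trained features pass through untouched.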

Diffusion Prior Estimator (DPE) estimates the LDP through probabilistic diffusion modeling, which lets it handle arbitrary target domains and produce accurate prior estimates for domain alignment.
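As a rough illustration of how a diffusion process can estimate a latent from target-side conditioning alone, the sketch below runs generic DDPM ancestral sampling over a small latent vector. It is not the DPE implementation. The linear noise schedule, the step count, and the names `make_schedule`, `sample_prior`, and `make_oracle_denoiser` are all assumptions made for this example. The oracle denoiser is a stand-in for a trained noise-prediction network: it pretends the correct latent equals the conditioning vector, which makes the sampler easy to check.

```python
import math
import random
from typing import Callable, List, Tuple

Denoiser = Callable[[List[float], int, List[float]], List[float]]


def make_schedule(num_steps: int, beta_start: float = 1e-4,
                  beta_end: float = 0.02) -> Tuple[List[float], List[float]]:
    """Linear beta schedule (an assumption) and cumulative products of 1 - beta."""
    betas = [beta_start + (beta_end - beta_start) * t / (num_steps - 1)
             for t in range(num_steps)]
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return betas, alpha_bars


def sample_prior(denoiser: Denoiser, cond: List[float], dim: int,
                 num_steps: int = 50, seed: int = 0) -> List[float]:
    """DDPM ancestral sampling: start from Gaussian noise and denoise step by step."""
    rng = random.Random(seed)
    betas, alpha_bars = make_schedule(num_steps)
    z = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    for t in reversed(range(num_steps)):
        eps = denoiser(z, t, cond)  # predicted noise at step t
        coef = betas[t] / math.sqrt(1.0 - alpha_bars[t])
        mean = [(zi - coef * ei) / math.sqrt(1.0 - betas[t])
                for zi, ei in zip(z, eps)]
        if t > 0:
            sigma = math.sqrt(betas[t])  # fresh noise on every step but the last
            z = [m + sigma * rng.gauss(0.0, 1.0) for m in mean]
        else:
            z = mean
    return z


def make_oracle_denoiser(num_steps: int) -> Denoiser:
    """Hypothetical stand-in for a trained network: assumes the clean latent is `cond`."""
    _, alpha_bars = make_schedule(num_steps)

    def denoiser(z: List[float], t: int, cond: List[float]) -> List[float]:
        a = alpha_bars[t]
        return [(zi - math.sqrt(a) * ci) / math.sqrt(1.0 - a)
                for zi, ci in zip(z, cond)]

    return denoiser


cond = [0.3, -1.2, 0.8]  # toy conditioning vector standing in for target features
estimate = sample_prior(make_oracle_denoiser(50), cond, dim=3, num_steps=50)
```

With the oracle denoiser, the final noise-free step recovers the conditioning vector up to floating-point error. With a learned denoiser, the same loop would instead return a sample from the learned prior over the latent.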


Quantitative Results

The quantitative results compare domain generalization methods under two training settings: Cityscapes (real-world) and GTAV (synthetic). PDAF improves consistently across scenarios and backbones (ResNet-50, Swin-T, Swin-L), highlighting the method's generalization ability and overall effectiveness.
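This page does not name the evaluation metric. Semantic segmentation comparisons are conventionally reported as mean Intersection-over-Union (mIoU), so the sketch below shows how that score is computed, on the assumption that the tables follow this convention. The function name `mean_iou` is hypothetical, and the `ignore_index` value of 255 follows the common Cityscapes labeling convention rather than anything stated here.

```python
from typing import Sequence


def mean_iou(pred: Sequence[int], target: Sequence[int],
             num_classes: int, ignore_index: int = 255) -> float:
    """Mean IoU over flattened label maps, averaged over classes that occur.

    Assumes every non-ignored target label lies in [0, num_classes).
    Benchmarks typically accumulate these counts over the whole dataset;
    this sketch scores a single pair of label maps.
    """
    intersection = [0] * num_classes
    union = [0] * num_classes
    for p, t in zip(pred, target):
        if t == ignore_index:
            continue  # unlabeled pixels do not count toward any class
        if p == t:
            intersection[t] += 1
            union[t] += 1
        else:
            union[t] += 1  # missed pixel of the true class
            if 0 <= p < num_classes:
                union[p] += 1  # false positive for the predicted class
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    return sum(ious) / len(ious) if ious else 0.0


# Toy example: class 0 has IoU 1/2, class 1 has IoU 2/3, class 2 never occurs.
score = mean_iou([0, 0, 1, 1, 2], [0, 1, 1, 1, 255], num_classes=3)
```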

Comparison with existing methods trained on Cityscapes.
Comparison with existing methods trained on GTAV.

Qualitative Results

PDAF effectively enhances the generalization ability of the baseline model and preserves detailed structural information, such as road boundaries and object contours, resulting in more precise and consistent segmentation outcomes.


BibTeX

@article{chen2025exploring,
      title={Exploring Probabilistic Modeling Beyond Domain Generalization for Semantic Segmentation},
      author={Chen, I and Chang, Hua-En and Chen, Wei-Ting and Hwang, Jenq-Neng and Kuo, Sy-Yen and others},
      journal={arXiv preprint arXiv:2507.21367},
      year={2025}
}