Deblur4DGS: 4D Gaussian Splatting from Blurry Monocular Videos

AAAI 2026

Renlong Wu, Zhilu Zhang, MingYang Chen, ZiFei Yan, Wangmeng Zuo
Harbin Institute of Technology

[Paper]      [Code]     



Visual comparisons for novel-view synthesis. Compared with existing state-of-the-art 4D reconstruction methods, Deblur4DGS produces photo-realistic results from blurry monocular videos while maintaining real-time rendering speed.

Abstract

Recent 4D reconstruction methods have yielded impressive results but rely on sharp videos as supervision. However, motion blur often occurs in videos due to camera shake and object movement, and existing methods render blurry results when such videos are used to reconstruct 4D models. Although a few approaches have attempted to address the problem, they struggle to produce high-quality results because of the inaccuracy in estimating continuous dynamic representations within the exposure time. Encouraged by recent works on 3D motion trajectory modeling with 3D Gaussian Splatting (3DGS), we adopt 3DGS as the scene representation and propose Deblur4DGS to reconstruct a high-quality 4D model from blurry monocular video. Specifically, we transform the estimation of continuous dynamic representations within the exposure time into the estimation of the exposure time itself. Moreover, we introduce an exposure regularization term as well as multi-frame and multi-resolution consistency regularization terms to avoid trivial solutions. Furthermore, to better represent objects with large motion, we suggest blur-aware variable canonical Gaussians. Beyond novel-view synthesis, Deblur4DGS can be applied to improve blurry videos from multiple perspectives, including deblurring, frame interpolation, and video stabilization. Extensive experiments on both synthetic and real-world data across the above four tasks show that Deblur4DGS outperforms state-of-the-art 4D reconstruction methods.
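Concretely, the supervision behind this formulation is the classic blur-formation model: the captured blurry frame is explained as the average of latent sharp renderings taken at discrete timestamps within the exposure time. Using the notation of the Method section below (with \(\mathcal{R}\) introduced here purely as a shorthand for 3DGS rendering), this reads

\[
\hat{\mathbf{B}}_{t} = \frac{1}{N}\sum_{i=1}^{N} \hat{\mathbf{I}}_{t,i},
\qquad
\hat{\mathbf{I}}_{t,i} = \mathcal{R}\!\left(\mathbf{P}_{t,i}, \mathbf{D}_{t,i}, \mathbf{S}\right),
\]

and the reconstruction loss \(\mathcal{L}_{rec}\) compares \(\hat{\mathbf{B}}_{t}\) with the observed blurry frame \(\mathbf{B}_{t}\).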

Method


(a) Training of Deblur4DGS. When processing the \(t\)-th frame, we first discretize its exposure time into \(N\) timestamps. Then, we estimate continuous camera poses \(\{\mathbf{P}_{t,i}\}_{i=1}^{N}\) and dynamic Gaussians \(\{\mathbf{D}_{t,i}\}_{i=1}^{N}\) within the exposure time. Next, we render each latent sharp image \(\hat{\mathbf{I}}_{t,i}\) with the camera pose \(\mathbf{P}_{t,i}\), dynamic Gaussians \(\mathbf{D}_{t,i}\), and static Gaussians \(\mathbf{S}\). Finally, \(\{\hat{\mathbf{I}}_{t,i}\}_{i=1}^{N}\) are averaged to obtain the synthetic blurry image \(\hat{\mathbf{B}}_{t}\), which is used to calculate the reconstruction loss \(\mathcal{L}_{rec}\) against the given blurry frame \(\mathbf{B}_{t}\). To regularize the under-constrained optimization, we introduce exposure regularization \(\mathcal{L}_{e}\), multi-frame consistency regularization \(\mathcal{L}_{mfc}\), and multi-resolution consistency regularization \(\mathcal{L}_{mrc}\). (b) Rendering of Deblur4DGS. Deblur4DGS produces a sharp image for a user-provided timestamp \(t\) and camera pose \(\mathbf{P}_{t}\).
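The training step above can be summarized in a short sketch. This is a minimal illustration rather than the released implementation: `estimate_pose`, `estimate_dynamic_gaussians`, and `render` are placeholder callables standing in for the continuous camera-pose model, the dynamic Gaussian prediction, and the 3DGS rasterizer, and spacing the timestamps with a learnable `exposure` scalar is an assumption meant to reflect the paper's exposure-time estimation.

```python
import torch
import torch.nn.functional as F

def synthesize_blurry_frame(t, N, exposure, static_gaussians,
                            estimate_pose, estimate_dynamic_gaussians, render):
    """Average N latent sharp renderings within the exposure time of frame t.

    The three callables are hypothetical stand-ins for the pose model, the
    dynamic Gaussian prediction, and 3DGS rasterization; `exposure` is a
    learnable scalar controlling how far the N timestamps spread around t.
    """
    latent_images = []
    for i in range(N):
        # Discretize the exposure time of frame t into N timestamps.
        tau = t + (i / max(N - 1, 1) - 0.5) * exposure
        P_ti = estimate_pose(tau)                    # camera pose P_{t,i}
        D_ti = estimate_dynamic_gaussians(tau)       # dynamic Gaussians D_{t,i}
        I_ti = render(P_ti, D_ti, static_gaussians)  # latent sharp image I_{t,i}
        latent_images.append(I_ti)
    # The synthetic blurry image is the average of the latent sharp renderings.
    return torch.stack(latent_images, dim=0).mean(dim=0)

def reconstruction_loss(B_hat, B_t):
    # L_rec compares the synthetic blurry image with the captured blurry frame;
    # an L1 penalty is used here only as a simple placeholder.
    return F.l1_loss(B_hat, B_t)
```

At rendering time, no averaging is needed: a single latent sharp image is produced for the requested timestamp and camera pose.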


Novel-View Synthesis

Deblur4DGS produces more visually pleasing results in both static and dynamic areas, marked with yellow and red boxes, respectively.

Novel-View Synthesis (Fix Time, Change View)

Deblur4DGS allows users to observe the world at a fixed timestamp from different views.

Novel-View Synthesis (Change Time, Fix View)

Deblur4DGS allows users to observe the world at different timestamps from a fixed view.

Deblurring

Compared with 4D reconstruction-based methods, Deblur4DGS produces sharper content and fewer visual artifacts in both static and dynamic areas, marked with yellow and red boxes, respectively.

Frame Interpolation

When fed interpolated camera poses and timestamps, Deblur4DGS produces frame-interpolated results.
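As a rough illustration of how such inputs could be constructed, the sketch below linearly interpolates timestamps and camera positions between two captured frames and applies SLERP to the rotations; `render_sharp` and the pose dictionary format are assumptions for illustration, not the released interface.

```python
import numpy as np

def interpolate_frames(t0, t1, pose0, pose1, num_inserts, render_sharp):
    """Render intermediate frames between two captured frames.

    `pose0`/`pose1` are assumed to hold a translation "t" and a unit
    quaternion "q"; `render_sharp(t, pose)` is a placeholder for querying
    the trained Deblur4DGS model at a timestamp and camera pose.
    """
    frames = []
    for k in range(1, num_inserts + 1):
        alpha = k / (num_inserts + 1)
        t = (1 - alpha) * t0 + alpha * t1                       # interpolated timestamp
        trans = (1 - alpha) * pose0["t"] + alpha * pose1["t"]   # linear translation
        quat = slerp(pose0["q"], pose1["q"], alpha)             # interpolated rotation
        frames.append(render_sharp(t, {"t": trans, "q": quat}))
    return frames

def slerp(q0, q1, alpha):
    # Minimal quaternion SLERP; assumes unit quaternions as numpy arrays.
    dot = np.clip(np.dot(q0, q1), -1.0, 1.0)
    if dot < 0.0:        # take the shorter arc
        q1, dot = -q1, -dot
    if dot > 0.9995:     # nearly parallel: fall back to normalized lerp
        q = (1 - alpha) * q0 + alpha * q1
        return q / np.linalg.norm(q)
    theta = np.arccos(dot)
    return (np.sin((1 - alpha) * theta) * q0 + np.sin(alpha * theta) * q1) / np.sin(theta)
```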

Video Stabilization

With smoothed camera poses as inputs, Deblur4DGS renders a more stable video.
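One simple way to obtain such smoothed poses is to low-pass filter the per-frame camera trajectory before re-rendering; the moving-average filter below is just one possible choice and is not claimed to be the smoothing used by the authors.

```python
import numpy as np

def smooth_camera_positions(positions, window=9):
    """Moving-average smoothing of per-frame camera positions.

    `positions` is an (F, 3) array of camera centers estimated for the F
    input frames; the smoothed trajectory is fed back to the renderer
    together with the original timestamps to produce a stabilized video.
    """
    positions = np.asarray(positions, dtype=np.float64)
    half = window // 2
    smoothed = np.empty_like(positions)
    for f in range(len(positions)):
        lo, hi = max(0, f - half), min(len(positions), f + half + 1)
        smoothed[f] = positions[lo:hi].mean(axis=0)
    return smoothed
```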

The project page template is borrowed from DreamBooth and StableVITON.