Deblur4DGS: 4D Gaussian Splatting from Blurry Monocular Video

Renlong Wu, Zhilu Zhang, Mingyang Chen, Xiaopeng Fan, Zifei Yan, Wangmeng Zuo

Harbin Institute of Technology

[Paper]      [Code]     



Visual comparisons for novel-view synthesis. Compared with existing state-of-the-art 4D reconstruction methods, Deblur4DGS produces photo-realistic results from blurry monocular video while maintaining real-time rendering speed.

Abstract

Recent 4D reconstruction methods have yielded impressive results, but they rely on sharp videos as supervision. Motion blur, however, often occurs in videos due to camera shake and object movement, and existing methods render blurry results when such videos are used to reconstruct 4D models. Although a few NeRF-based approaches have attempted to address the problem, they struggle to produce high-quality results because of the inaccuracy in estimating continuous dynamic representations within the exposure time. Encouraged by recent works on 3D motion trajectory modeling with 3D Gaussian Splatting (3DGS), we adopt 3DGS as the scene representation and propose Deblur4DGS, the first 4D Gaussian Splatting framework to reconstruct a high-quality 4D model from blurry monocular video. Specifically, we transform the estimation of continuous dynamic representations within the exposure time into the estimation of the exposure time itself. Moreover, we introduce exposure regularization to avoid trivial solutions, as well as multi-frame and multi-resolution consistency regularizations to alleviate artifacts. Furthermore, to better represent objects with large motion, we suggest blur-aware variable canonical Gaussians. Beyond novel-view synthesis, Deblur4DGS can be applied to improve blurry videos from multiple perspectives, including deblurring, frame interpolation, and video stabilization. Extensive experiments on these four tasks show that Deblur4DGS outperforms state-of-the-art 4D reconstruction methods. The code will be publicly available.

Method


(a) Training of Deblur4DGS. When processing the \(t\)-th frame, we first discretize its exposure time into \(N\) timestamps. Then, we estimate continuous camera poses \(\{\mathbf{P}_{t,i}\}_{i=1}^{N}\) and dynamic Gaussians \(\{\mathbf{D}_{t,i}\}_{i=1}^{N}\) within the exposure time. Next, we render each latent sharp image \(\hat{\mathbf{I}}_{t,i}\) with the camera pose \(\mathbf{P}_{t,i}\), dynamic Gaussians \(\mathbf{D}_{t,i}\), and static Gaussians \(\mathbf{S}\). Finally, \(\{\hat{\mathbf{I}}_{t,i}\}_{i=1}^{N}\) are averaged to obtain the synthetic blurry image \(\hat{\mathbf{B}}_{t}\), which is used to calculate the reconstruction loss \(\mathcal{L}_{rec}\) with the given blurry frame \(\mathbf{B}_{t}\). To regularize the under-constrained optimization, we introduce exposure regularization \(\mathcal{L}_{e}\), multi-frame consistency regularization \(\mathcal{L}_{mfc}\), and multi-resolution consistency regularization \(\mathcal{L}_{mrc}\). (b) Rendering of Deblur4DGS. Deblur4DGS produces a sharp image at a user-provided timestamp \(t\) and camera pose \(\mathbf{P}_{t}\).
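
To make the blur formation model concrete, here is a minimal PyTorch-style sketch of one training step under the assumptions above. The helpers `estimate_pose_at`, `deform_gaussians_at`, and `render` are hypothetical placeholders for the continuous pose estimate, the dynamic-Gaussian deformation, and the splatting rasterizer; they are not the released API.

```python
import torch
import torch.nn.functional as F

def training_step(blurry_frame, t, exposure, static_gaussians, N=9):
    """Synthesize a blurry frame by averaging N latent sharp renderings
    within the estimated exposure time, then compare it with the input.
    `estimate_pose_at`, `deform_gaussians_at`, and `render` are hypothetical
    stand-ins for pose estimation, Gaussian deformation, and rasterization.
    """
    # Discretize the exposure time of the t-th frame into N timestamps.
    timestamps = t + exposure * torch.linspace(-0.5, 0.5, N)

    latents = []
    for t_i in timestamps:
        P_ti = estimate_pose_at(t_i)      # continuous camera pose P_{t,i}
        D_ti = deform_gaussians_at(t_i)   # dynamic Gaussians D_{t,i}
        # Render the latent sharp image from dynamic + static Gaussians.
        latents.append(render(D_ti, static_gaussians, P_ti))

    # Average the latent sharp images to synthesize the blurry image B_t.
    synthetic_blur = torch.stack(latents).mean(dim=0)

    # Reconstruction loss against the observed blurry frame
    # (the consistency and exposure regularizers are omitted here).
    return F.l1_loss(synthetic_blur, blurry_frame)
```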


Quantitative Comparisons

Quantitative comparisons on the novel-view synthesis, deblurring, frame interpolation, and video stabilization tasks show that Deblur4DGS significantly outperforms state-of-the-art 4D reconstruction methods.

Novel-View Synthesis

Deblur4DGS produces more visually pleasing results in both static and dynamic areas, marked with yellow and red boxes respectively.

Novel-View Synthesis (Fix Time, Change View)

Deblur4DGS allows users to observe the world at a fixed timestamp from different views.

Novel-View Synthesis (Change Time, Fix View)

Deblur4DGS allows users to observe the world at different timestamps from a fixed view.

Deblurring

Compared with 4D-reconstruction-based methods, Deblur4DGS produces sharper content and fewer visual artifacts in both static and dynamic areas, marked with yellow and red boxes respectively.

Frame Interpolation

When fed interpolated camera poses and timestamps, Deblur4DGS produces frame-interpolated results, as sketched below.
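
As an illustration of how such inputs might be constructed (not necessarily the paper's exact procedure), the sketch below linearly interpolates translations and timestamps and spherically interpolates rotations between two keyframe poses; `render_at` is a hypothetical wrapper around the trained model.

```python
import numpy as np
from scipy.spatial.transform import Rotation, Slerp

def interpolate_poses(R0, t0, R1, t1, num=5):
    """Interpolate camera poses between two frames: slerp for rotations,
    lerp for translations. R0/R1 are 3x3 rotation matrices; t0/t1 are
    3-vectors. Returns (num, 3, 3) rotations and (num, 3) translations."""
    alphas = np.linspace(0.0, 1.0, num)
    slerp = Slerp([0.0, 1.0], Rotation.from_matrix([R0, R1]))
    rotations = slerp(alphas).as_matrix()
    translations = (1 - alphas)[:, None] * t0 + alphas[:, None] * t1
    return rotations, translations

# Usage: render intermediate frames at interpolated poses and timestamps.
# `render_at(time, R, t)` is a hypothetical wrapper around Deblur4DGS.
# Rs, ts = interpolate_poses(R_a, t_a, R_b, t_b, num=5)
# frames = [render_at(time_a + a * (time_b - time_a), R, t)
#           for a, R, t in zip(np.linspace(0, 1, 5), Rs, ts)]
```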

Video Stabilization

With smoothed camera poses as inputs, Deblur4DGS renders a more stable video.
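
One common way to obtain such smoothed poses (an assumption for illustration, not the paper's stated procedure) is to low-pass filter the estimated camera trajectory, for example:

```python
import numpy as np
from scipy.ndimage import gaussian_filter1d
from scipy.spatial.transform import Rotation

def smooth_trajectory(rotations, translations, sigma=3.0):
    """Smooth a camera trajectory: Gaussian-filter the translations, and
    approximately smooth the rotations by filtering unit quaternions (with
    sign alignment) and renormalizing.
    rotations: (T, 3, 3) matrices; translations: (T, 3) vectors."""
    smooth_t = gaussian_filter1d(translations, sigma=sigma, axis=0)

    quats = Rotation.from_matrix(rotations).as_quat()  # (T, 4), xyzw
    # Align quaternion signs so q and -q (the same rotation) do not cancel.
    for i in range(1, len(quats)):
        if np.dot(quats[i], quats[i - 1]) < 0:
            quats[i] = -quats[i]
    smooth_q = gaussian_filter1d(quats, sigma=sigma, axis=0)
    smooth_q /= np.linalg.norm(smooth_q, axis=1, keepdims=True)
    return Rotation.from_quat(smooth_q).as_matrix(), smooth_t
```

The smoothed poses can then be fed to the renderer in place of the original shaky trajectory to produce the stabilized video.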

Project page template is borrowed from DreamBooth and StableVITON.