Deblur4DGS: 4D Gaussian Splatting from Blurry Monocular Video
Renlong Wu, Zhilu Zhang, Mingyang Chen, Xiaopeng Fan, Zifei Yan, Wangmeng Zuo
Harbin Institute of Technology
[Paper]
[Code]
Visual comparisons for novel-view synthesis. Compared with existing state-of-the-art 4D reconstruction methods, Deblur4DGS produces photo-realistic results from blurry monocular video while maintaining real-time rendering speed.
Abstract
Recent 4D reconstruction methods have yielded impressive results but rely on sharp videos as supervision.
However, motion blur often occurs in videos due to camera shake and object movement, and existing methods render blurry results when such videos are used to reconstruct 4D models.
Although a few NeRF-based approaches have attempted to address the problem, they struggle to produce high-quality results due to inaccuracy in estimating continuous dynamic representations within the exposure time.
Encouraged by recent work on 3D motion trajectory modeling with 3D Gaussian Splatting (3DGS), we adopt 3DGS as the scene representation and propose Deblur4DGS, the first 4D Gaussian Splatting framework to reconstruct a high-quality 4D model from blurry monocular video.
Specifically, we transform the estimation of continuous dynamic representations within an exposure time into the estimation of the exposure time itself.
Moreover, we introduce exposure regularization to avoid trivial solutions, as well as multi-frame and multi-resolution consistency regularizations to alleviate artifacts.
Furthermore, to better represent objects with large motion, we suggest blur-aware variable canonical Gaussians.
Beyond novel-view synthesis, Deblur4DGS can be applied to improve blurry videos in multiple ways, including deblurring, frame interpolation, and video stabilization.
Extensive experiments on the above four tasks show that Deblur4DGS outperforms state-of-the-art 4D reconstruction methods.
The code will be publicly available.
Method
(a) Training of Deblur4DGS.
When processing the \(t\)-th frame, we first discretize its exposure time into \(N\) timestamps.
Then, we estimate continuous camera poses \(\{\mathbf{P}_{t,i}\}_{i=1}^{N}\) and dynamic Gaussians \(\{\mathbf{D}_{t,i}\}_{i=1}^{N}\) within the exposure time.
Next, we render each latent sharp image \(\hat{\mathbf{I}}_{t,i}\) with the camera pose \(\mathbf{P}_{t,i}\), dynamic Gaussians \(\mathbf{D}_{t,i}\), and static Gaussians \(\mathbf{S}\).
Finally, \(\{\hat{\mathbf{I}}_{t,i}\}_{i=1}^{N}\) are averaged to obtain the synthetic blurry image \(\hat{\mathbf{B}}_{t}\), which is used to calculate the reconstruction loss \(\mathcal{L}_{rec}\) with the given blurry frame \(\mathbf{B}_{t}\).
To regularize the under-constrained optimization, we introduce exposure regularization \(\mathcal{L}_{e}\), multi-frame consistency regularization \(\mathcal{L}_{mfc}\), and multi-resolution consistency regularization \(\mathcal{L}_{mrc}\).
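To make the training pipeline concrete, below is a minimal, runnable PyTorch-style sketch of the blur-synthesis step described above. It is not the released implementation: `render`, `interpolate_pose`, and `dynamic_model` are toy placeholders, the linear pose blend is an assumption, and the regularizers \(\mathcal{L}_{e}\), \(\mathcal{L}_{mfc}\), and \(\mathcal{L}_{mrc}\) are only indicated since their exact forms are defined in the paper.

```python
import torch
import torch.nn.functional as F

# Toy stand-ins for the real rasterizer and time-conditioned dynamic
# Gaussians; the actual Deblur4DGS components are far more involved.
H, W = 32, 32
static_gaussians = torch.randn(100, 3)                     # placeholder static model S
dynamic_weights = torch.randn(100, 3, requires_grad=True)  # placeholder dynamic model
exposure = torch.tensor(0.5, requires_grad=True)           # learned exposure time

def dynamic_model(t_i):
    # Placeholder: time-conditioned dynamic Gaussians D_{t,i}.
    return dynamic_weights * t_i

def render(pose, D, S):
    # Placeholder "rasterizer" producing an H x W image from pose and Gaussians.
    return (pose.sum() + D.mean() + S.mean()) * torch.ones(H, W)

def interpolate_pose(P0, P1, alpha):
    # Placeholder continuous camera pose within the exposure window
    # (the paper estimates continuous poses; a linear blend is used here for brevity).
    return (1 - alpha) * P0 + alpha * P1

def training_step(blurry_frame, P0, P1, t, N=5):
    # 1. Discretize the (learned) exposure time into N timestamps.
    timestamps = t + exposure * torch.linspace(-0.5, 0.5, N)
    # 2-3. Get pose and dynamic Gaussians per timestamp; render latent sharp images.
    latents = [render(interpolate_pose(P0, P1, i / (N - 1)),
                      dynamic_model(t_i), static_gaussians)
               for i, t_i in enumerate(timestamps)]
    # 4. Average the latents into the synthetic blurry image B_hat_t.
    blurry_pred = torch.stack(latents).mean(dim=0)
    # Reconstruction loss L_rec against the captured blurry frame B_t; the
    # exposure / multi-frame / multi-resolution regularizers would be added here.
    return F.l1_loss(blurry_pred, blurry_frame)

loss = training_step(torch.zeros(H, W), torch.eye(4), torch.eye(4), t=torch.tensor(0.0))
loss.backward()  # gradients flow to the dynamic model and the exposure estimate
```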
(b) Rendering of Deblur4DGS. Deblur4DGS produces a sharp image at a user-provided timestamp \(t\) and camera pose \(\mathbf{P}_{t}\).
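At inference, novel-view synthesis reduces to a single render call at the user-chosen inputs; continuing the toy sketch above:

```python
# Sharp image at user-provided timestamp t and camera pose P_t.
t_user, P_user = torch.tensor(0.25), torch.eye(4)
sharp_image = render(P_user, dynamic_model(t_user), static_gaussians)
```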
Quantitative Comparisons
Quantitative comparisons on novel-view synthesis, deblurring, frame interpolation, and video stabilization show that Deblur4DGS significantly outperforms state-of-the-art 4D reconstruction methods.
Novel-View Synthesis
Deblur4DGS produces more visually pleasing results in both static and dynamic areas, marked with yellow and red boxes, respectively.
Novel-View Synthesis (Fix Time, Change View)
Deblur4DGS allows users to observe the world at a fixed timestamp from different views.
[Video comparison: TimeStamp 1 vs. TimeStamp 2]
Novel-View Synthesis (Change Time, Fix View)
Deblur4DGS allows users to observe the world at different timestamps from a fixed view.
[Video comparison: View 1 vs. View 2]
Deblurring
Compared with 4D-reconstruction-based methods, Deblur4DGS produces sharper content and fewer visual artifacts in both static and dynamic areas, marked with yellow and red boxes, respectively.
Frame Interpolation
Given interpolated camera poses and timestamps as inputs, Deblur4DGS can produce frame-interpolated results (a pose-interpolation sketch follows the comparison below).
[Video comparison: Input (low frame rate) vs. Deblur4DGS]
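One plausible way to obtain the interpolated camera poses is SLERP on rotations combined with linear interpolation on translations; the page does not spell out the scheme used, so the helper below is an assumption, not the paper's method:

```python
import numpy as np
from scipy.spatial.transform import Rotation, Slerp

def interpolate_cameras(R0, t0, R1, t1, num_mid=1):
    """Hypothetical helper: poses strictly between two captured frames,
    using SLERP for rotations and linear interpolation for translations."""
    slerp = Slerp([0.0, 1.0], Rotation.from_matrix(np.stack([R0, R1])))
    alphas = np.linspace(0.0, 1.0, num_mid + 2)[1:-1]  # interior points only
    return [(slerp(a).as_matrix(), (1.0 - a) * t0 + a * t1) for a in alphas]

# Each interpolated (R, t) pair, together with a matching interpolated
# timestamp, is fed to the renderer to synthesize the in-between frame.
mid_poses = interpolate_cameras(np.eye(3), np.zeros(3),
                                Rotation.from_euler("y", 10, degrees=True).as_matrix(),
                                np.array([0.1, 0.0, 0.0]))
```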
Video Stabilization
With smoothed camera poses as inputs, Deblur4DGS can render a more stable video (a trajectory-smoothing sketch follows the comparison below).
[Video comparison: Input (with camera shake) vs. Deblur4DGS]
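The smoothed trajectory can be obtained in many ways; a simple assumed scheme is a Gaussian filter on translations plus windowed rotation averaging (the paper's smoothing method may differ):

```python
import numpy as np
from scipy.ndimage import gaussian_filter1d
from scipy.spatial.transform import Rotation

def smooth_trajectory(Rs, ts, sigma=2.0, window=5):
    """Hypothetical stabilizer: low-pass filter the translations and replace
    each rotation with the mean of its neighbors in a sliding window."""
    ts_smooth = gaussian_filter1d(np.asarray(ts, dtype=float), sigma=sigma, axis=0)
    rots = Rotation.from_matrix(np.stack(Rs))
    half = window // 2
    Rs_smooth = [rots[max(0, i - half):i + half + 1].mean().as_matrix()
                 for i in range(len(Rs))]
    return Rs_smooth, ts_smooth  # render from these poses for a stable video
```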
Project page template is borrowed from DreamBooth and StableVITON.