Non-Rigid Neural Radiance Fields: Reconstruction and Novel View Synthesis of a Dynamic Scene From Monocular Video
We present Non-Rigid Neural Radiance Fields (NR-NeRF), a reconstruction and novel view synthesis approach for general non-rigid dynamic scenes. Our approach takes RGB images of a dynamic scene as input, e.g., from a monocular video recording, and creates a high-quality space-time geometry and appearance representation. In particular, we show that even a single handheld consumer-grade camera is sufficient to synthesize sophisticated renderings of a dynamic scene from novel virtual camera views, for example a `bullet-time' video effect. Our method disentangles the dynamic scene into a canonical volume and its deformation. Scene deformation is implemented as ray bending, where straight rays are deformed non-rigidly to represent scene motion. We also propose a novel rigidity regression network that enables us to better constrain rigid regions of the scene, which leads to more stable results. The ray bending and rigidity network are trained without any explicit supervision. In addition to novel view synthesis, our formulation enables dense correspondence estimation across views and time, as well as compelling video editing applications such as motion exaggeration. We demonstrate the effectiveness of our method using extensive evaluations, including ablation studies and comparisons to the state of the art. We urge the reader to watch the supplemental video for qualitative results. Our code will be open sourced.
READ FULL TEXT