NormalCrafter: Learning Temporally Consistent Normals from Video Diffusion Priors
Abstract
Surface normal estimation serves as a cornerstone for a spectrum of computer vision applications. While numerous efforts have been devoted to static image scenarios, ensuring temporal coherence in video-based normal estimation remains a formidable challenge. Instead of merely augmenting existing methods with temporal components, we present NormalCrafter, which leverages the inherent temporal priors of video diffusion models. To secure high-fidelity normal estimation across sequences, we propose Semantic Feature Regularization (SFR), which aligns diffusion features with semantic cues, encouraging the model to concentrate on the intrinsic semantics of the scene. Moreover, we introduce a two-stage training protocol that combines latent- and pixel-space learning to preserve spatial accuracy while maintaining long temporal context. Extensive evaluations demonstrate the efficacy of our method, showcasing superior performance in generating temporally consistent normal sequences with intricate details from diverse videos.
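The abstract does not spell out the SFR formulation. A minimal sketch of one plausible instantiation, assuming both feature maps are projected to a common channel dimension and alignment is scored by cosine similarity (all names and shapes here are hypothetical, not the paper's actual loss):

```python
import numpy as np

def sfr_loss(diffusion_feats, semantic_feats, eps=1e-8):
    """Hypothetical Semantic Feature Regularization sketch: penalize
    misalignment between diffusion features and semantic features as
    1 - mean cosine similarity over spatial locations.
    Both inputs: (N, C) arrays, N spatial locations, C channels."""
    a = diffusion_feats / (np.linalg.norm(diffusion_feats, axis=1, keepdims=True) + eps)
    b = semantic_feats / (np.linalg.norm(semantic_feats, axis=1, keepdims=True) + eps)
    cos = np.sum(a * b, axis=1)      # per-location cosine similarity
    return float(1.0 - cos.mean())   # 0 when features point the same way

# Scaling preserves direction, so the loss is ~0 for aligned features.
f = np.random.randn(16, 64)
print(sfr_loss(f, 2.0 * f))
```

In practice such a term would be added to the diffusion training objective with a weighting coefficient, with the semantic features taken from a frozen encoder.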
Community
This is an automated message from Librarian Bot. The following similar papers were recommended by the Semantic Scholar API:
- GeometryCrafter: Consistent Geometry Estimation for Open-world Videos with Diffusion Priors (2025)
- TrajectoryCrafter: Redirecting Camera Trajectory for Monocular Videos via Diffusion Models (2025)
- Can Video Diffusion Model Reconstruct 4D Geometry? (2025)
- EGVD: Event-Guided Video Diffusion Model for Physically Realistic Large-Motion Frame Interpolation (2025)
- GenFusion: Closing the Loop between Reconstruction and Generation via Videos (2025)
- GSV3D: Gaussian Splatting-based Geometric Distillation with Stable Video Diffusion for Single-Image 3D Object Generation (2025)
- Stereo Any Video: Temporally Consistent Stereo Matching (2025)