Audio-visual Controlled Video Diffusion with Masked Selective State Spaces Modeling for Natural Talking Head Generation • Paper • 2504.02542 • Published Apr 3
Talking Face Generation with Multilingual TTS 👄 • Generate a talking face video from text