Hypergraph Multi-modal Large Language Model: Exploiting EEG and Eye-tracking Modalities to Evaluate Heterogeneous Responses for Video Understanding Paper • 2407.08150 • Published Jul 11, 2024
FaceVid-1K: A Large-Scale High-Quality Multiracial Human Face Video Dataset Paper • 2410.07151 • Published Sep 23, 2024
TV-3DG: Mastering Text-to-3D Customized Generation with Visual Prompt Paper • 2410.21299 • Published Oct 16, 2024
UniCP: A Unified Caching and Pruning Framework for Efficient Video Generation Paper • 2502.04393 • Published Feb 6, 2025
Tuning-Free Long Video Generation via Global-Local Collaborative Diffusion Paper • 2501.05484 • Published Jan 8, 2025