ViSpeak: Visual Instruction Feedback in Streaming Videos Paper • 2503.12769 • Published 6 days ago • 8
Mitigating Visual Forgetting via Take-along Visual Conditioning for Multi-modal Long CoT Reasoning Paper • 2503.13360 • Published 6 days ago • 5
Temporal Regularization Makes Your Video Generator Stronger Paper • 2503.15417 • Published 4 days ago • 20
φ-Decoding: Adaptive Foresight Sampling for Balanced Inference-Time Exploration and Exploitation Paper • 2503.13288 • Published 6 days ago • 46