InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models • Paper • 2504.10479 • Published 19 days ago • 252
VBench-2.0: Advancing Video Generation Benchmark Suite for Intrinsic Faithfulness • Paper • 2503.21755 • Published Mar 27 • 33
BIMBA: Selective-Scan Compression for Long-Range Video Question Answering • Paper • 2503.09590 • Published Mar 12 • 3