IV-Bench: A Benchmark for Image-Grounded Video Perception and Reasoning in Multimodal LLMs • Paper • 2504.15415 • Published 20 days ago
SuperGPQA: Scaling LLM Evaluation across 285 Graduate Disciplines • Paper • 2502.14739 • Published Feb 20
Can MLLMs Understand the Deep Implication Behind Chinese Images? • Paper • 2410.13854 • Published Oct 17, 2024