Don't Look Only Once: Towards Multimodal Interactive Reasoning with Selective Visual Revisitation Paper • 2505.18842 • Published 10 days ago • 32
ProRL: Prolonged Reinforcement Learning Expands Reasoning Boundaries in Large Language Models Paper • 2505.24864 • Published 4 days ago • 100
Time Blindness: Why Video-Language Models Can't See What Humans Can? Paper • 2505.24867 • Published 4 days ago • 65
AlphaOne: Reasoning Models Thinking Slow and Fast at Test Time Paper • 2505.24863 • Published 4 days ago • 72
Embodied Agents Meet Personalization: Exploring Memory Utilization for Personalized Assistance Paper • 2505.16348 • Published 12 days ago • 46
BizFinBench: A Business-Driven Real-World Financial Benchmark for Evaluating LLMs Paper • 2505.19457 • Published 9 days ago • 61
Code Graph Model (CGM): A Graph-Integrated Large Language Model for Repository-Level Software Engineering Tasks Paper • 2505.16901 • Published 12 days ago • 20
UI-Genie: A Self-Improving Approach for Iteratively Boosting MLLM-based Mobile GUI Agents Paper • 2505.21496 • Published 7 days ago • 38
MME-VideoOCR: Evaluating OCR-Based Capabilities of Multimodal LLMs in Video Scenarios Paper • 2505.21333 • Published 7 days ago • 38
Guided by Gut: Efficient Test-Time Scaling with Reinforced Intrinsic Confidence Paper • 2505.20325 • Published 11 days ago • 44
VerIPO: Cultivating Long Reasoning in Video-LLMs via Verifier-Gudied Iterative Policy Optimization Paper • 2505.19000 • Published 9 days ago • 41
SynLogic: Synthesizing Verifiable Reasoning Data at Scale for Learning Logical Reasoning and Beyond Paper • 2505.19641 • Published 8 days ago • 61
ScienceBoard: Evaluating Multimodal Autonomous Agents in Realistic Scientific Workflows Paper • 2505.19897 • Published 8 days ago • 99
MME-Reasoning: A Comprehensive Benchmark for Logical Reasoning in MLLMs Paper • 2505.21327 • Published 7 days ago • 81
Paper2Poster: Towards Multimodal Poster Automation from Scientific Papers Paper • 2505.21497 • Published 7 days ago • 90
Towards Dynamic Theory of Mind: Evaluating LLM Adaptation to Temporal Evolution of Human States Paper • 2505.17663 • Published 11 days ago • 13
What Makes for Text to 360-degree Panorama Generation with Stable Diffusion? Paper • 2505.22129 • Published 6 days ago • 15