Advancing Multimodal Reasoning via Reinforcement Learning with Cold Start Paper • 2505.22334 • Published 6 days ago • 36
Unsupervised Post-Training for Multi-Modal LLM Reasoning via GRPO Paper • 2505.22453 • Published 6 days ago • 45
MMMR: Benchmarking Massive Multi-Modal Reasoning Tasks Paper • 2505.16459 • Published 12 days ago • 45
MMMR: Benchmarking Massive Multi-Modal Reasoning Tasks Paper • 2505.16459 • Published 12 days ago • 45
NodeRAG: Structuring Graph-based RAG with Heterogeneous Nodes Paper • 2504.11544 • Published Apr 15 • 42
Missing Premise exacerbates Overthinking: Are Reasoning Models losing Critical Thinking Skill? Paper • 2504.06514 • Published Apr 9 • 39
Missing Premise exacerbates Overthinking: Are Reasoning Models losing Critical Thinking Skill? Paper • 2504.06514 • Published Apr 9 • 39
On the Trustworthiness of Generative Foundation Models: Guideline, Assessment, and Perspective Paper • 2502.14296 • Published Feb 20 • 46
Open-Sora Plan: Open-Source Large Video Generation Model Paper • 2412.00131 • Published Nov 28, 2024 • 35
LLaVA-o1: Let Vision Language Models Reason Step-by-Step Paper • 2411.10440 • Published Nov 15, 2024 • 125
LLaVA-o1: Let Vision Language Models Reason Step-by-Step Paper • 2411.10440 • Published Nov 15, 2024 • 125
Both Text and Images Leaked! A Systematic Analysis of Multimodal LLM Data Contamination Paper • 2411.03823 • Published Nov 6, 2024 • 50
BiomedGPT: A Unified and Generalist Biomedical Generative Pre-trained Transformer for Vision, Language, and Multimodal Tasks Paper • 2305.17100 • Published May 26, 2023 • 2
BenTo: Benchmark Task Reduction with In-Context Transferability Paper • 2410.13804 • Published Oct 17, 2024 • 20
BenTo: Benchmark Task Reduction with In-Context Transferability Paper • 2410.13804 • Published Oct 17, 2024 • 20
MLP-KAN: Unifying Deep Representation and Function Learning Paper • 2410.03027 • Published Oct 3, 2024 • 31
MLP-KAN: Unifying Deep Representation and Function Learning Paper • 2410.03027 • Published Oct 3, 2024 • 31
Mora: Enabling Generalist Video Generation via A Multi-Agent Framework Paper • 2403.13248 • Published Mar 20, 2024 • 78 • 7