PianoMotion10M: Dataset and Benchmark for Hand Motion Generation in Piano Performance Paper • 2406.09326 • Published Jun 13, 2024 • 1
PixelThink: Towards Efficient Chain-of-Pixel Reasoning Paper • 2505.23727 • Published 5 days ago • 3 • 1
Can MLLMs Guide Me Home? A Benchmark Study on Fine-Grained Visual Reasoning from Transit Maps Paper • 2505.18675 • Published 10 days ago • 23
TokenPacker: Efficient Visual Projector for Multimodal LLM Paper • 2407.02392 • Published Jul 2, 2024 • 24