HoliTom: Holistic Token Merging for Fast Video Large Language Models • Paper • 2505.21334 • Published 7 days ago • 18
Can MLLMs Guide Me Home? A Benchmark Study on Fine-Grained Visual Reasoning from Transit Maps • Paper • 2505.18675 • Published 10 days ago • 23
Plug-and-Play 1.x-Bit KV Cache Quantization for Video Large Language Models • Paper • 2503.16257 • Published Mar 20 • 24
Stop Overthinking: A Survey on Efficient Reasoning for Large Language Models • Paper • 2503.16419 • Published Mar 20 • 74
Niagara: Normal-Integrated Geometric Affine Field for Scene Reconstruction from a Single View • Paper • 2503.12553 • Published Mar 16 • 7
SnapFusion: Text-to-Image Diffusion Model on Mobile Devices within Two Seconds • Paper • 2306.00980 • Published Jun 1, 2023 • 15