LoHoVLA: A Unified Vision-Language-Action Model for Long-Horizon Embodied Tasks Paper • 2506.00411 • Published 3 days ago • 21
MM-PRM: Enhancing Multimodal Mathematical Reasoning with Scalable Step-Level Supervision Paper • 2505.13427 • Published 15 days ago • 25
Show-o Turbo: Towards Accelerated Unified Multimodal Understanding and Generation Paper • 2502.05415 • Published Feb 8 • 22