Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters Paper • 2408.03314 • Published Aug 6, 2024 • 63
TAG: A Decentralized Framework for Multi-Agent Hierarchical Reinforcement Learning Paper • 2502.15425 • Published Feb 21, 2025 • 9
AlphaMaze: Enhancing Large Language Models' Spatial Intelligence via GRPO Paper • 2502.14669 • Published Feb 20, 2025 • 14
Husky: A Unified, Open-Source Language Agent for Multi-Step Reasoning Paper • 2406.06469 • Published Jun 10, 2024 • 30
Cognitive Kernel: An Open-source Agent System towards Generalist Autopilots Paper • 2409.10277 • Published Sep 16, 2024 • 1
Breaking the Modality Barrier: Universal Embedding Learning with Multimodal LLMs Paper • 2504.17432 • Published Apr 2025 • 37