Absolute Zero: Reinforced Self-play Reasoning with Zero Data • Paper • 2505.03335 • Published 3 days ago • 78
Agentic Reasoning and Tool Integration for LLMs via Reinforcement Learning • Paper • 2505.01441 • Published 11 days ago • 30
LLaMA-Omni2: LLM-based Real-time Spoken Chatbot with Autoregressive Streaming Speech Synthesis • Paper • 2505.02625 • Published 3 days ago • 16
Self-Generated In-Context Examples Improve LLM Agents for Sequential Decision-Making Tasks • Paper • 2505.00234 • Published 8 days ago • 21
DeepCritic: Deliberate Critique with Large Language Models • Paper • 2505.00662 • Published 7 days ago • 48
Beyond the Last Answer: Your Reasoning Trace Uncovers More than You Think • Paper • 2504.20708 • Published 10 days ago • 21
WebThinker: Empowering Large Reasoning Models with Deep Research Capability • Paper • 2504.21776 • Published 8 days ago • 41
NEMOTRON-CROSSTHINK: Scaling Self-Learning beyond Math Reasoning • Paper • 2504.13941 • Published 23 days ago • 9
SPC: Evolving Self-Play Critic via Adversarial Games for LLM Reasoning • Paper • 2504.19162 • Published 12 days ago • 15
Does Reinforcement Learning Really Incentivize Reasoning Capacity in LLMs Beyond the Base Model? • Paper • 2504.13837 • Published 20 days ago • 119