Unveiling Instruction-Specific Neurons & Experts: An Analytical Framework for LLM's Instruction-Following Capabilities • Paper • 2505.21191 • Published 7 days ago • 2
Absolute Zero: Reinforced Self-play Reasoning with Zero Data • Paper • 2505.03335 • Published 28 days ago • 168
MiMo: Unlocking the Reasoning Potential of Language Model -- From Pretraining to Posttraining • Paper • 2505.07608 • Published 22 days ago • 77
Reinforcement Learning for Reasoning in Small LLMs: What Works and What Doesn't • Paper • 2503.16219 • Published Mar 20 • 51