Reinforcement Learning for Reasoning in Small LLMs: What Works and What Doesn't
Paper • 2503.16219
Model weights and datasets for the paper "Reinforcement Learning for Reasoning in Small LLMs: What Works and What Doesn't"