Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models • Paper • 2401.01335 • Published Jan 2, 2024 • 68
Insights into DeepSeek-V3: Scaling Challenges and Reflections on Hardware for AI Architectures • Paper • 2505.09343 • Published 20 days ago • 62
Mergenetic: a Simple Evolutionary Model Merging Library • Paper • 2505.11427 • Published 18 days ago • 12
Absolute Zero: Reinforced Self-play Reasoning with Zero Data • Paper • 2505.03335 • Published 28 days ago • 168