Reinforcement Learning for Reasoning in Large Language Models with One Training Example Paper • 2504.20571 • Published 10 days ago • 89
QuaDMix: Quality-Diversity Balanced Data Selection for Efficient LLM Pretraining Paper • 2504.16511 • Published 16 days ago • 20
InfiMM-WebMath-40B: Advancing Multimodal Pre-Training for Enhanced Mathematical Reasoning Paper • 2409.12568 • Published Sep 19, 2024 • 51
InfiMM-HD: A Leap Forward in High-Resolution Multimodal Understanding Paper • 2403.01487 • Published Mar 3, 2024 • 16