Position: AI Competitions Provide the Gold Standard for Empirical Rigor in GenAI Evaluation • Paper 2505.00612 • Published 15 days ago