Fostering Video Reasoning via Next-Event Prediction • Paper • 2505.22457 • Published 6 days ago • 27
Adversarial Attacks against Closed-Source MLLMs via Feature Optimal Alignment • Paper • 2505.21494 • Published 7 days ago • 8
BanditSpec: Adaptive Speculative Decoding via Bandit Algorithms • Paper • 2505.15141 • Published 13 days ago • 4
QuickVideo: Real-Time Long Video Understanding with System Algorithm Co-Design • Paper • 2505.16175 • Published 13 days ago • 39
Optimizing Anytime Reasoning via Budget Relative Policy Optimization • Paper • 2505.13438 • Published 15 days ago • 35
Active PRM • Collection • Efficient Process Reward Model Training via Active Learning. • 4 items • Updated Apr 16 • 3
Orient Anything: Learning Robust Object Orientation Estimation from Rendering 3D Models • Paper • 2412.18605 • Published Dec 24, 2024 • 21
NoisyRollout: Reinforcing Visual Reasoning with Data Augmentation • Paper • 2504.13055 • Published Apr 17 • 19
Efficient Process Reward Model Training via Active Learning • Paper • 2504.10559 • Published Apr 14 • 13
Understanding R1-Zero-Like Training: A Critical Perspective • Paper • 2503.20783 • Published Mar 26 • 49
Sailor Language Models • Collection • Sailor: Open Language Models tailored for South-East Asia (SEA) released by Sea AI Lab. • 17 items • Updated Dec 3, 2024 • 17
Scaling Laws with Vocabulary • Collection • Increase your vocabulary size when you scale up your language model • 5 items • Updated Aug 11, 2024 • 6
RegMix: Data Mixture as Regression • Collection • Automatic data mixture method for large language model pre-training • 10 items • Updated Jul 26, 2024 • 8
Sailor2 Language Models • Collection • Sailing in South-East Asia with Inclusive Multilingual LLMs • 34 items • Updated Feb 24 • 28
Cheating Automatic LLM Benchmarks: Null Models Achieve High Win Rates • Paper • 2410.07137 • Published Oct 9, 2024 • 7
DICE • Collection • Self-alignment with DPO Implicit Rewards • 5 items • Updated Jul 28, 2024 • 9