Open-Sora 2.0: Training a Commercial-Level Video Generation Model in $200k • Paper • 2503.09642 • Published 4 days ago • 14 • 2
R1-Onevision: Advancing Generalized Multimodal Reasoning through Cross-Modal Formalization • Paper • 2503.10615 • Published 3 days ago • 14 • 3
Light-R1: Curriculum SFT, DPO and RL for Long COT from Scratch and Beyond • Paper • 2503.10460 • Published 3 days ago • 17 • 3
Wan2.1 14B 480p I2V LoRAs • Collection • A collection of Remade's Wan2.1 14B 480p I2V LoRAs • 24 items • Updated 1 day ago • 42
Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models • Paper • 2503.09573 • Published 4 days ago • 49 • 3
Gemini Embedding: Generalizable Embeddings from Gemini • Paper • 2503.07891 • Published 6 days ago • 28 • 3
AlphaDrive: Unleashing the Power of VLMs in Autonomous Driving via Reinforcement Learning and Reasoning • Paper • 2503.07608 • Published 6 days ago • 19 • 1
Vision-R1: Incentivizing Reasoning Capability in Multimodal Large Language Models • Paper • 2503.06749 • Published 7 days ago • 22 • 2
Forgetting Transformer: Softmax Attention with a Forget Gate • Paper • 2503.02130 • Published 13 days ago • 27
R1-Searcher: Incentivizing the Search Capability in LLMs via Reinforcement Learning • Paper • 2503.05592 • Published 9 days ago • 25 • 2
R1-Omni: Explainable Omni-Multimodal Emotion Recognition with Reinforcing Learning • Paper • 2503.05379 • Published 9 days ago • 32 • 3
R1-Zero's "Aha Moment" in Visual Reasoning on a 2B Non-SFT Model • Paper • 2503.05132 • Published 9 days ago • 48 • 2
TinyR1-32B-Preview: Boosting Accuracy with Branch-Merge Distillation • Paper • 2503.04872 • Published 10 days ago • 14 • 2
Learning from Failures in Multi-Attempt Reinforcement Learning • Paper • 2503.04808 • Published 12 days ago • 17 • 2