Collections

Collections including paper arxiv:2505.18129

One-RL-to-See-Them-All

One RL to See Them All: Visual Triple Unified Reinforcement Learning. GitHub: https://github.com/MiniMax-AI/One-RL-to-See-Them-All

One-RL-to-See-Them-All/Orsta-7B
Image-Text-to-Text • Updated 1 day ago • 483 • 7

One-RL-to-See-Them-All/Orsta-32B-0321
Image-Text-to-Text • Updated 8 days ago • 12

One-RL-to-See-Them-All/Orsta-32B-0326
Image-Text-to-Text • Updated 1 day ago • 112 • 4

One RL to See Them All: Visual Triple Unified Reinforcement Learning
Paper • 2505.18129 • Published 11 days ago • 59

EVA-CLIP-18B: Scaling CLIP to 18 Billion Parameters
Paper • 2402.04252 • Published Feb 6, 2024 • 29

Vision Superalignment: Weak-to-Strong Generalization for Vision Foundation Models
Paper • 2402.03749 • Published Feb 6, 2024 • 13

ScreenAI: A Vision-Language Model for UI and Infographics Understanding
Paper • 2402.04615 • Published Feb 7, 2024 • 44

EfficientViT-SAM: Accelerated Segment Anything Model Without Performance Loss
Paper • 2402.05008 • Published Feb 7, 2024 • 23

Scaling Law for Quantization-Aware Training
Paper • 2505.14302 • Published 14 days ago • 71

Reward Reasoning Model
Paper • 2505.14674 • Published 14 days ago • 34

Qwen3 Technical Report
Paper • 2505.09388 • Published 20 days ago • 175

AdaptThink: Reasoning Models Can Learn When to Think
Paper • 2505.13417 • Published 15 days ago • 78

BLIP3-o: A Family of Fully Open Unified Multimodal Models: Architecture, Training and Dataset
Paper • 2505.09568 • Published 20 days ago • 88

Qwen3 Technical Report
Paper • 2505.09388 • Published 20 days ago • 175

GuardReasoner-VL: Safeguarding VLMs via Reinforced Reasoning
Paper • 2505.11049 • Published 18 days ago • 59

Emerging Properties in Unified Multimodal Pretraining
Paper • 2505.14683 • Published 14 days ago • 128

Unified Multimodal Understanding and Generation Models: Advances, Challenges, and Opportunities
Paper • 2505.02567 • Published 29 days ago • 74

TabSTAR: A Foundation Tabular Model With Semantically Target-Aware Representations
Paper • 2505.18125 • Published 11 days ago • 110

Distilling LLM Agent into Small Models with Retrieval and Code Tools
Paper • 2505.17612 • Published 11 days ago • 75

One RL to See Them All: Visual Triple Unified Reinforcement Learning
Paper • 2505.18129 • Published 11 days ago • 59

stuff i never have time to read

CLIMB: CLustering-based Iterative Data Mixture Bootstrapping for Language Model Pre-training
Paper • 2504.13161 • Published Apr 17 • 92

Hebbian Learning based Orthogonal Projection for Continual Learning of Spiking Neural Networks
Paper • 2402.11984 • Published Feb 19, 2024

BlackGoose Rimer: Harnessing RWKV-7 as a Simple yet Superior Replacement for Transformers in Large-Scale Time Series Modeling
Paper • 2503.06121 • Published Mar 8 • 5

Timer: Transformers for Time Series Analysis at Scale
Paper • 2402.02368 • Published Feb 4, 2024

CoRAG: Collaborative Retrieval-Augmented Generation
Paper • 2504.01883 • Published Apr 2 • 10

VL-Rethinker: Incentivizing Self-Reflection of Vision-Language Models with Reinforcement Learning
Paper • 2504.08837 • Published Apr 10 • 43

Mavors: Multi-granularity Video Representation for Multimodal Large Language Model
Paper • 2504.10068 • Published Apr 14 • 30

xVerify: Efficient Answer Verifier for Reasoning Model Evaluations
Paper • 2504.10481 • Published Apr 14 • 84

Papers + RL/Reasoning

DAPO: An Open-Source LLM Reinforcement Learning System at Scale
Paper • 2503.14476 • Published Mar 18 • 128

VAPO: Efficient and Reliable Reinforcement Learning for Advanced Reasoning Tasks
Paper • 2504.05118 • Published Apr 7 • 25

SQL-R1: Training Natural Language to SQL Reasoning Model By Reinforcement Learning
Paper • 2504.08600 • Published Apr 11 • 29

A Minimalist Approach to LLM Reasoning: from Rejection Sampling to Reinforce
Paper • 2504.11343 • Published Apr 15 • 17
