galois77's Collections

RL

updated 5 days ago

  • Towards General-Purpose Model-Free Reinforcement Learning

    Paper • 2501.16142 • Published Jan 27 • 30

  • RL + Transformer = A General-Purpose Problem Solver

    Paper • 2501.14176 • Published Jan 24 • 28

  • SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training

    Paper • 2501.17161 • Published Jan 28 • 121

  • Process-Supervised Reinforcement Learning for Code Generation

    Paper • 2502.01715 • Published Feb 3

  • VideoChat-R1: Enhancing Spatio-Temporal Perception via Reinforcement Fine-Tuning

    Paper • 2504.06958 • Published Apr 9 • 11

  • ZeroSearch: Incentivize the Search Capability of LLMs without Searching

    Paper • 2505.04588 • Published 6 days ago • 54

  • Improving Editability in Image Generation with Layer-wise Memory

    Paper • 2505.01079 • Published 11 days ago • 27