open-r1 (Open R1)

lewtun

updated a Space 13 minutes ago

68

Large Reasoning Models Leaderboard

🐳

A leaderboard to rank large reasoning models

burtenshaw

posted an update 6 days ago

Post

2110

The rebooted LLM course starts today with an overhauled chapter 1 on Transformers:

👉 Follow the org to join the course:

huggingface-course

We’re starting from the foundations of modern generative AI by looking at transformers. This chapter is expanded in depth and features so contains new material like:

FREE and CERTIFIED exam on fundamentals of transformers
deeper exploration of transformer architectures and attention mechanisms
end -to-end exploration of inference strategies for prefill and decode steps

The course has leveled up in complexity and depth, so this a great time to join in if you want to build you own AI models.

fdaudens

posted an update 6 days ago

Post

2045

@thomwolf and @m-ric teaming up as the perfect instructor duo for DeepLearning.ai’s new course: Building Code Agents with Hugging Face smolagents!

https://www.deeplearning.ai/short-courses/building-code-agents-with-hugging-face-smolagents/

davanstrien

posted an update 7 days ago

Post

1856

Came across a very nice submission from @marcodsn for the reasoning datasets competition (https://huggingface.co/blog/bespokelabs/reasoning-datasets-competition).

The dataset distils reasoning chains from arXiv research papers in biology and economics. Some nice features of the dataset:

- Extracts both the logical structure AND researcher intuition from academic papers
- Adopts the persona of researchers "before experiments" to capture exploratory thinking
- Provides multi-short and single-long reasoning formats with token budgets - Shows 7.2% improvement on MMLU-Pro Economics when fine-tuning a 3B model

It's created using the Curator framework with plans to scale across more scientific domains and incorporate multi-modal reasoning with charts and mathematics.

I personally am very excited about datasets like this, which involve creativity in their creation and don't just rely on $$$ to produce a big dataset with little novelty.

Dataset can be found here: marcodsn/academic-chains (give it a like!)

m-ric

posted an update 12 days ago

Post

2418

New king of open VLMs: InternVL3 takes Qwen 2.5's crown! 👑

InternVL have been a wildly successful series of model : and the latest iteration has just taken back their crown thanks to their superior, natively multimodal vision training pipeline.

➡️ Most of the vision language models (VLMs) these days are built like Frankenstein : take a good text-only Large Language Model (LLM) backbone, stitch a specific vision transformer (ViT) on top of it. Then the training is sequential 🔢 : 1. Freeze the LLM weights while you train the ViT only to work with the LLM part, then 2. Unfreeze all weights to train all weights in order to work together.

💫 The Shanghai Lab decided to challenge this paradigm and chose this approach that they call "native". For each of their model sizes, they still start from a good LLM (mostly Qwen-2.5 series, did I tell you I'm a huge fan of Qwen? ❤️), and stitch the ViT, but they don't freeze anything : they train all weights together with interleaved text and image understanding data in a single pre-training phase 🎨.

They claim it results in more seamless interactions between modalities. And the results prove them right: they took the crown of top VLMs, at nearly all sizes, from their Qwen-2.5 parents. 👑

2 replies

·

burtenshaw

posted an update 13 days ago

Post

1792

Hacked my presentation building with inference providers, Cohere command a, and sheer simplicity. Use this script if you’re burning too much time on presentations:

🔗 https://github.com/burtenshaw/course_generator/blob/main/scripts/create_presentation.py

This is what it does:
- uses command a to generates slides and speaker notes based on some material.
- it renders the material in remark open format and imports all images, tables, etc
- you can then review the slides as markdown and iterate
- export to either pdf or pptx using backslide

🚀 Next steps are: add text to speech for the audio and generate a video. This should make Hugging Face educational content scale to a billion AI Learners.

1 reply

·

yjernite

posted an update 13 days ago

Post

3133

Today in Privacy & AI Tooling - introducing a nifty new tool to examine where data goes in open-source apps on 🤗

HF Spaces have tons (100Ks!) of cool demos leveraging or examining AI systems - and because most of them are OSS we can see exactly how they handle user data 📚🔍

That requires actually reading the code though, which isn't always easy or quick! Good news: code LMs have gotten pretty good at automatic review, so we can offload some of the work - here I'm using Qwen/Qwen2.5-Coder-32B-Instruct to generate reports and it works pretty OK 🙌

The app works in three stages:
1. Download all code files
2. Use the Code LM to generate a detailed report pointing to code where data is transferred/(AI-)processed (screen 1)
3. Summarize the app's main functionality and data journeys (screen 2)
4. Build a Privacy TLDR with those inputs

It comes with a bunch of pre-reviewed apps/Spaces, great to see how many process data locally or through (private) HF endpoints 🤗

Note that this is a POC, lots of exciting work to do to make it more robust, so:
- try it: yjernite/space-privacy
- reach out to collab: yjernite/space-privacy

fdaudens

posted an update 14 days ago

Post

1530

Just tested something this morning that feels kind of game-changing for how we publish, discover, and consume news with AI: connecting Claude directly to the New York Times through MCP.

Picture this: You ask Claude about a topic, and it instantly pulls verified and trusted NYT content — no more guessing if the info is accurate.

The cool part? Publishers stay in control of what they share via API, and users get fast, reliable access through the AI tools they already use. Instead of scraping random stuff off the web, we get a future where publishers actively shape how their journalism shows up in AI.

It’s still a bit technical to set up right now, but this could get super simple soon — like installing apps on your phone, but for your chatbot. And you keep the brand connection, too.

Not saying it solves everything, but it’s definitely a new way to distribute content — and maybe even find some fresh value in the middle of this whole news + AI shakeup. Early movers will have a head start.

Curious what folks think — could MCPs be a real opportunity for journalism?

1 reply

·

thomwolf

posted an update 16 days ago

Post

4467

If you've followed the progress of robotics in the past 18 months, you've likely noticed how robotics is increasingly becoming the next frontier that AI will unlock.

At Hugging Face—in robotics and across all AI fields—we believe in a future where AI and robots are open-source, transparent, and affordable; community-built and safe; hackable and fun. We've had so much mutual understanding and passion working with the Pollen Robotics team over the past year that we decided to join forces!

You can already find our open-source humanoid robot platform Reachy 2 on the Pollen website and the Pollen community and people here on the hub at

pollen-robotics

We're so excited to build and share more open-source robots with the world in the coming months!

1 reply

·

fdaudens

posted an update 18 days ago

Post

2124

Want AI that truly understands your country's culture? Public institutions are sitting on the next AI revolution - and here's the practical guide to unlock it.

I've had fascinating conversations recently about sovereign AI, with people trying to solve this recurring question: "How do we build AI that truly understands our culture?"

This guide by @evijit and @yjernite brings lots of insights about this question. It's not just about throwing data at models. It's about partnering cultural expertise with tech infrastructure in ways we're just starting to figure out.

An example? The National Library of Norway already has 150+ AI models on Hugging Face. They're not just digitizing books - they're building AI that thinks in Norwegian, understands Norwegian values, and serves Norwegian citizens.

This is sovereign AI in practice: technology that understands your culture, values, and languages.

Especially loved the practical examples on how to do this:
- Real examples from museums, libraries, and government agencies
- How to convert complex documents (PDFs, PowerPoints) into ML-ready formats
- Code templates for processing public data
- Technical recipes for sharing datasets on open platforms

The stakes? Citizens' ability to leverage their collective digital intelligence.

The technology is ready. The infrastructure exists. The guide shows exactly how to use it. What's needed is your cultural expertise to shape these tools.

Check it out: https://huggingface.co/blog/evijit/public-org-data-ai

P.s.: Building cool projects in a public institution? Share them in the comments for others to learn from!

fdaudens

posted an update 20 days ago

Post

2816

Do chatbots lie about Céline Dion? We now have answers, not speculation.

Ai2 just released OLMoTrace and it's a game-changer for transparency. You can literally see where an AI's responses come from in its training data - in real time.

The demo shows results about Céline. So I tried it out myself! Watch what happens in the video.

For journalists, researchers studying hallucinations and anyone who needs to trust their AI, this is like getting X-ray vision into AI systems. When the model made claims, I could instantly verify them against original sources. When it hallucinated, I could see why.

You can finally 1) understand how LLMs actually work and 2) verify if what they're saying is true. No more blind trust.

This pushes the open data movement to the next level.

👉 Blog post: https://allenai.org/blog/olmotrace
👉 Paper: https://www.datocms-assets.com/64837/1743890415-olmotrace.pdf

P.S.: A word of caution: never use a chatbot as a knowledge base. It's not Google. Better use it with a connection to the internet.

1 reply

·

fdaudens

posted an update 20 days ago

Post

4092

🎨 Designers, meet OmniSVG! This new model helps you create professional vector graphics from text/images, generate editable SVGs from icons to detailed characters, convert rasters to vectors, maintain style consistency with references, and integrate into your workflow.

@OmniSVG

2 replies

·

davanstrien

posted an update 21 days ago

Post

1651

I've created a v1 dataset ( davanstrien/reasoning-required) and model ( davanstrien/ModernBERT-based-Reasoning-Required) to help curate "wild text" data for generating reasoning examples beyond the usual code/math/science domains.

- I developed a "Reasoning Required" dataset with a 0-4 scoring system for reasoning complexity
- I used educational content from HuggingFaceFW/fineweb-edu, adding annotations for domains, reasoning types, and example questions

My approach enables a more efficient workflow: filter text with small models first, then use LLMs only on high-value content.

This significantly reduces computation costs while expanding reasoning dataset domain coverage.

thomwolf

authored a paper 22 days ago

SmolVLM: Redefining small and efficient multimodal models

Paper • 2504.05299 • Published 23 days ago • 177

lewtun

authored a paper 22 days ago

SmolVLM: Redefining small and efficient multimodal models

Paper • 2504.05299 • Published 23 days ago • 177

nouamanetazi

authored a paper 22 days ago

SmolVLM: Redefining small and efficient multimodal models

Paper • 2504.05299 • Published 23 days ago • 177

anton-l

authored a paper 22 days ago

SmolVLM: Redefining small and efficient multimodal models

Paper • 2504.05299 • Published 23 days ago • 177

cyrilzakka

authored a paper 22 days ago

SmolVLM: Redefining small and efficient multimodal models

Paper • 2504.05299 • Published 23 days ago • 177

eliebak

authored a paper 22 days ago

SmolVLM: Redefining small and efficient multimodal models

Paper • 2504.05299 • Published 23 days ago • 177

lvwerra

authored a paper 22 days ago

SmolVLM: Redefining small and efficient multimodal models

Paper • 2504.05299 • Published 23 days ago • 177

Open R1

AI & ML interests

Recent Activity

Articles

Open R1: Update #4

Open R1: Update #3

Open R1: Update #2

Open-R1: Update #1

Mini-R1: Reproduce Deepseek R1 „aha moment“ a RL tutorial

open-r1's activity

Large Reasoning Models Leaderboard

SmolVLM: Redefining small and efficient multimodal models

SmolVLM: Redefining small and efficient multimodal models

SmolVLM: Redefining small and efficient multimodal models

SmolVLM: Redefining small and efficient multimodal models

SmolVLM: Redefining small and efficient multimodal models

SmolVLM: Redefining small and efficient multimodal models

SmolVLM: Redefining small and efficient multimodal models

AI & ML interests

Recent Activity

Articles

Open R1: Update #4

Open R1: Update #3

Open R1: Update #2

Open-R1: Update #1

Mini-R1: Reproduce Deepseek R1 „aha moment“ a RL tutorial

Team members 32

open-r1's activity

Large Reasoning Models Leaderboard