Ever wanted 45 min with one of AI's most fascinating minds? I got that with @thomwolf at HumanX Vegas. Sharing my notes from his Q&A with the press; it completely changed how I think about AI's future:
1️⃣ The next wave of successful AI companies won't be defined by who has the best model but by who builds the most useful real-world solutions. "We all have engines in our cars, but that's rarely the only reason we buy one. We expect it to work well, and that's enough. LLMs will be the same."
2️⃣ Big players are pivoting: "Closed-source companies, OpenAI being the first, have largely shifted from LLM announcements to product announcements."
3️⃣ Open source is changing everything: "DeepSeek was open source AI's ChatGPT moment. Basically, everyone outside the bubble realized you can get a model for free, and it's just as good as the paid ones."
4️⃣ Product innovation is being democratized: Take Manus, for example: they built a product on top of Anthropic's models that's "actually better than Anthropic's own product for now, in terms of agents." This proves that anyone can build great products with existing models.
We're entering a "multi-LLM world" where models are becoming commoditized and all the tools to build with them are readily available; just look at the flurry of new releases landing on Hugging Face every day.
Thom's comparison to the internet era is spot-on: "In the beginning you made a lot of money by making websites... but nowadays the huge internet companies are not the companies that built websites. Like Airbnb, Uber, Facebook, they just use the internet as a medium to make something for real life use cases."
Google just dropped an exciting technical report for the brand-new Gemma 3 model! 🚀 Here are my personal notes highlighting the most intriguing architectural innovations, design choices, and insights from this release:
1) Architecture choices:
> No more soft-capping, replaced by QK-norm
> Both pre AND post norm
> Wider MLP than Qwen2.5, ~same depth
> SWA with a 5:1 ratio and a 1024-token window (very small, and a cool ablation in the paper!)
> No MLA to save KV cache, SWA does the job!
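To make the 5:1 interleaving concrete, here's a minimal sketch (illustrative names and structure, not Gemma 3's actual code) of how sliding-window and global attention layers alternate:

```python
# Sketch of Gemma 3-style attention interleaving: 5 sliding-window (local)
# layers for every 1 global layer, with a 1024-token window on the local ones.
# Names and structure are illustrative assumptions, not the released code.
SLIDING_WINDOW = 1024
GLOBAL_EVERY = 6  # every 6th layer is global -> 5:1 local:global ratio

def layer_attention_types(num_layers: int) -> list[str]:
    """Return 'global' or 'sliding' for each layer index."""
    return [
        "global" if (i + 1) % GLOBAL_EVERY == 0 else "sliding"
        for i in range(num_layers)
    ]

print(layer_attention_types(12))
# ['sliding', 'sliding', 'sliding', 'sliding', 'sliding', 'global', ...] (x2)
```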
2) Long context
> Only increase the RoPE base in the global layers (to 1M)
> Confirmation that it's harder to do long context for smol models: no 128k for the 1B
> Pretrained with 32k context? Seems very high
> No YaRN or Llama 3-style RoPE extension
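In practice, "only increase the RoPE base in the global layers" looks something like this sketch (the 10k/1M bases are from the notes above; the head_dim and helper are my assumptions):

```python
import torch

# Sketch: per-layer-type RoPE inverse frequencies. Per the report notes, only
# the global-attention layers get the 1M base for long context; sliding-window
# layers keep the usual 10k base. Illustrative, not Gemma 3's actual code.
def rope_inv_freq(head_dim: int, base: float) -> torch.Tensor:
    return 1.0 / (base ** (torch.arange(0, head_dim, 2).float() / head_dim))

local_inv_freq = rope_inv_freq(128, base=10_000.0)      # sliding-window layers
global_inv_freq = rope_inv_freq(128, base=1_000_000.0)  # global layers (long context)
print(local_inv_freq[:4], global_inv_freq[:4])
```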
3) Distillation
> Only keep the first 256 logits from the teacher
> Ablation on the teacher gap (tl;dr: you need some "patience" to see that using a small teacher is better)
> On-policy distillation, yeah! (by @agarwl_ et al.) Not sure if the teacher gap behaves the same here; curious if anyone has more info?
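Here's a rough sketch of what distilling on only the teacher's top 256 logits could look like. Renormalizing both distributions over the top-k subset is one plausible reading, not necessarily the paper's exact loss:

```python
import torch
import torch.nn.functional as F

def topk_distill_loss(student_logits, teacher_logits, k=256):
    # Keep only the teacher's top-k logits (k=256 per the report), gather the
    # student's logits at the same token ids, then match the two renormalized
    # distributions with a KL divergence.
    top_vals, top_idx = teacher_logits.topk(k, dim=-1)
    student_sub = student_logits.gather(-1, top_idx)
    return F.kl_div(
        F.log_softmax(student_sub, dim=-1),
        F.softmax(top_vals, dim=-1),
        reduction="batchmean",
    )

student = torch.randn(2, 8, 32_000)  # (batch, seq, vocab), dummy values
teacher = torch.randn(2, 8, 32_000)
print(topk_distill_loss(student, teacher).item())
```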
4) Others
> Checkpoints with QAT, that's very cool
> RL using an improved version of BOND, WARM/WARP (a good excuse to look at @ramealexandre's papers)
> Only ZeRO-3, no TP/PP, if I understand correctly?
> Training budget relatively similar to Gemma 2
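For reference, a pure ZeRO-3 data-parallel setup (no TP/PP) is typically just a config away. This DeepSpeed config dict is a sketch with illustrative values, not Google's actual training stack:

```python
# Minimal DeepSpeed ZeRO stage-3 config sketch: parameters, gradients, and
# optimizer states are all sharded across data-parallel ranks, with no tensor
# or pipeline parallelism involved. Values are illustrative.
ds_config = {
    "train_micro_batch_size_per_gpu": 4,
    "bf16": {"enabled": True},
    "zero_optimization": {
        "stage": 3,  # shard params + grads + optimizer states
        "overlap_comm": True,
        "stage3_gather_16bit_weights_on_model_save": True,
    },
}
```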
An assembly of 18 European companies, labs, and universities has banded together to launch 🇪🇺 EuroBERT! It's a state-of-the-art multilingual encoder covering 15 languages, designed to be finetuned for retrieval, classification, etc.
🇪🇺 15 Languages: English, French, German, Spanish, Chinese, Italian, Russian, Polish, Portuguese, Japanese, Vietnamese, Dutch, Arabic, Turkish, Hindi
3️⃣ 3 model sizes: 210M, 610M, and 2.1B parameters; very, very useful sizes in my opinion
⚡️ Sequence length of 8,192 tokens! Nice to see these higher sequence lengths for encoders becoming more common
⚙️ Architecture based on Llama, but with bi-directional (non-causal) attention to turn it into an encoder. Flash Attention 2 is supported
🔥 A new Pareto frontier (stronger *and* smaller) for multilingual encoder models
📊 Evaluated against mDeBERTa, mGTE, and XLM-RoBERTa on retrieval, classification, and regression (after finetuning for each task separately): EuroBERT punches way above its weight
📄 Detailed paper with all the details, incl. data: FineWeb for English and CulturaX for multilingual data, The Stack v2 and Proof-Pile-2 for code
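Getting started should be roughly this simple. The model id follows the release naming and the trust_remote_code flag is my assumption, so check the model card for the exact recipe:

```python
from transformers import AutoModel, AutoTokenizer

# Sketch: embed a sentence with the smallest EuroBERT checkpoint.
# Model id and trust_remote_code are assumptions based on the release.
model_id = "EuroBERT/EuroBERT-210m"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModel.from_pretrained(model_id, trust_remote_code=True)

inputs = tokenizer("EuroBERT is a multilingual encoder.", return_tensors="pt")
outputs = model(**inputs)
embedding = outputs.last_hidden_state.mean(dim=1)  # simple mean pooling, for illustration
print(embedding.shape)  # (1, hidden_size)
```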
The next step is for researchers to build upon the 3 EuroBERT base models and publish strong retrieval, zero-shot classification, etc. models for all to use. I'm very much looking forward to it!
Honored to be named among the 12 pioneers and power players in the news industry in Future Today Strategy Group's 2025 Tech Trends Report.
Incredible group to be part of - each person is doing groundbreaking work at the intersection of AI and journalism. Worth following them all: they're consistently sharing practical insights on building the future of news.
Take the time to read this report; it's packed with insights, as always. The news & information section's #1 insight hits hard: "The most substantive economic impact of AI to date has been licensing payouts for a handful of big publishers. The competition will start shifting in the year ahead to separate AI 'haves' that have positioned themselves to grow from the 'have-nots.'"
This AI-driven divide is something I've been really concerned about. Now is the time to build more than ever!
🚀 New smolagents update: Safer Local Python Execution! 🦾🔒
With the latest release, we've added security checks to the local Python interpreter: every evaluation is now analyzed for dangerous builtins, modules, and functions. 🔒
Here's why this matters & what you need to know! 🧵👇
1️⃣ Why is local execution risky? ⚠️ AI agents that run arbitrary Python code can unintentionally (or maliciously) access system files, run unsafe commands, or exfiltrate data.
2️⃣ New Safety Layer in smolagents 🛡️ We now inspect every return value during execution:
✅ Allowed: safe built-in types (e.g., numbers, strings, lists)
❌ Blocked: dangerous functions/modules (e.g., os.system, subprocess, exec, shutil)
(See the configuration sketch after this thread.)
3️⃣ Security Disclaimer ⚠️ 🚨 Despite these improvements, local Python execution is NEVER 100% safe. 🚨 If you need true isolation, use a remote sandboxed executor like Docker or E2B.
4️⃣ The Best Practice: Use Sandboxed Execution 🔒 For production-grade AI agents, we strongly recommend running code in a Docker or E2B sandbox to ensure complete isolation.
5️⃣ Upgrade Now & Stay Safe! 🚀 Check out the latest smolagents release and start building safer AI agents today.
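Here's a minimal sketch of the safer local setup. additional_authorized_imports is smolagents' opt-in list for extra modules; the model id is just an example:

```python
from smolagents import CodeAgent, HfApiModel

# Sketch: the local Python executor only allows safe builtins plus the
# modules you explicitly authorize; calls like os.system or subprocess are
# rejected. The model id below is an example, not a requirement.
model = HfApiModel(model_id="Qwen/Qwen2.5-Coder-32B-Instruct")
agent = CodeAgent(
    tools=[],
    model=model,
    additional_authorized_imports=["math", "statistics"],  # explicit opt-in only
)
agent.run("Compute the standard deviation of [2, 4, 4, 4, 5, 5, 7, 9].")
```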
🚀 Big news for AI agents! With the latest release of smolagents, you can now securely execute Python code in sandboxed Docker or E2B environments. 🦾🔒
Here's why this is a game-changer for agent-based systems: 🧵👇
1️⃣ Security First 🔒 Running AI agents in unrestricted Python environments is risky! With sandboxing, your agents are isolated, preventing unintended file access, network abuse, or system modifications.
2️⃣ Deterministic & Reproducible Runs 📦 By running agents in containerized environments, you ensure that every execution happens in a controlled and predictable setting. No more environment mismatches or dependency issues!
3️⃣ Resource Control & Limits 🚦 Docker and E2B allow you to enforce CPU, memory, and execution-time limits, so rogue or inefficient agents don't spiral out of control.
4️⃣ Safer Code Execution in Production 🔐 Deploy AI agents confidently, knowing that any generated code runs in an ephemeral, isolated environment, protecting your host machine and infrastructure.
5️⃣ Easy to Integrate 🛠️ With smolagents, you can simply configure your agent to use Docker or E2B as its execution backend, with no need for complex security setups! (See the sketch after this thread.)
6️⃣ Perfect for Autonomous AI Agents 🤖 If your AI agents generate and execute code dynamically, this is a must-have to avoid security pitfalls while enabling advanced automation.
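Here's a configuration sketch for the sandboxed setup. The executor_type argument reflects recent smolagents releases (check the docs for your version); the model id is an example, "docker" needs a running Docker daemon, and "e2b" needs an E2B API key:

```python
from smolagents import CodeAgent, HfApiModel

# Sketch: route all agent-generated code into an isolated sandbox instead of
# the local process. Swap "docker" for "e2b" to use E2B's remote sandbox.
model = HfApiModel(model_id="Qwen/Qwen2.5-Coder-32B-Instruct")
agent = CodeAgent(tools=[], model=model, executor_type="docker")
agent.run("Compute the first 20 Fibonacci numbers.")
```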