🚀 New smolagents update: Safer Local Python Execution! 🦾🐍
With the latest release, we've added security checks to the local Python interpreter: every evaluation is now analyzed for dangerous builtins, modules, and functions. 🔒
Here's why this matters & what you need to know! 🧵👇
1️⃣ Why is local execution risky? ⚠️ AI agents that run arbitrary Python code can unintentionally (or maliciously) access system files, run unsafe commands, or exfiltrate data.
2️⃣ New Safety Layer in smolagents 🛡️ We now inspect every return value during execution: ✅ Allowed: Safe built-in types (e.g., numbers, strings, lists) ⛔ Blocked: Dangerous functions/modules (e.g., os.system, subprocess, exec, shutil)
4️⃣ Security Disclaimer ⚠️ 🚨 Despite these improvements, local Python execution is NEVER 100% safe. 🚨 If you need true isolation, use a remote sandboxed executor like Docker or E2B.
5️⃣ The Best Practice: Use Sandboxed Execution 🔐 For production-grade AI agents, we strongly recommend running code in a Docker or E2B sandbox to ensure complete isolation.
6️⃣ Upgrade Now & Stay Safe! 🚀 Check out the latest smolagents release and start building safer AI agents today.
🚀 Big news for AI agents! With the latest release of smolagents, you can now securely execute Python code in sandboxed Docker or E2B environments. 🦾🔒
Here's why this is a game-changer for agent-based systems: 🧵👇
1️⃣ Security First 🔐 Running AI agents in unrestricted Python environments is risky! With sandboxing, your agents are isolated, preventing unintended file access, network abuse, or system modifications.
2️⃣ Deterministic & Reproducible Runs 📦 By running agents in containerized environments, you ensure that every execution happens in a controlled and predictable setting—no more environment mismatches or dependency issues!
3️⃣ Resource Control & Limits 🚦 Docker and E2B allow you to enforce CPU, memory, and execution time limits, so rogue or inefficient agents don’t spiral out of control.
4️⃣ Safer Code Execution in Production 🏭 Deploy AI agents confidently, knowing that any generated code runs in an ephemeral, isolated environment, protecting your host machine and infrastructure.
5️⃣ Easy to Integrate 🛠️ With smolagents, you can simply configure your agent to use Docker or E2B as its execution backend—no need for complex security setups!
6️⃣ Perfect for Autonomous AI Agents 🤖 If your AI agents generate and execute code dynamically, this is a must-have to avoid security pitfalls while enabling advanced automation.
In just 24 hours, we built an open-source agent that: ✅ Autonomously browse the web ✅ Search, scroll & extract info ✅ Download & manipulate files ✅ Run calculations on data
Introducing 📐𝐅𝐢𝐧𝐞𝐌𝐚𝐭𝐡: the best public math pre-training dataset with 50B+ tokens! HuggingFaceTB/finemath
Math remains challenging for LLMs and by training on FineMath we see considerable gains over other math datasets, especially on GSM8K and MATH.
We build the dataset by: 🛠️ carefully extracting math data from Common Crawl; 🔎 iteratively filtering and recalling high quality math pages using a classifier trained on synthetic annotations to identify math reasoning and deduction.
We conducted a series of ablations comparing the performance of Llama-3.2-3B-Base after continued pre-training on FineMath and observe notable gains compared to the baseline model and other public math datasets.
We hope this helps advance the performance of LLMs on math and reasoning! 🚀 We’re also releasing all the ablation models as well as the evaluation code.