π New smolagents update: Safer Local Python Execution! π¦Ύπ
With the latest release, we've added security checks to the local Python interpreter: every evaluation is now analyzed for dangerous builtins, modules, and functions. π
Here's why this matters & what you need to know! π§΅π
1οΈβ£ Why is local execution risky? β οΈ AI agents that run arbitrary Python code can unintentionally (or maliciously) access system files, run unsafe commands, or exfiltrate data.
2οΈβ£ New Safety Layer in smolagents π‘οΈ We now inspect every return value during execution: β Allowed: Safe built-in types (e.g., numbers, strings, lists) β Blocked: Dangerous functions/modules (e.g., os.system, subprocess, exec, shutil)
4οΈβ£ Security Disclaimer β οΈ π¨ Despite these improvements, local Python execution is NEVER 100% safe. π¨ If you need true isolation, use a remote sandboxed executor like Docker or E2B.
5οΈβ£ The Best Practice: Use Sandboxed Execution π For production-grade AI agents, we strongly recommend running code in a Docker or E2B sandbox to ensure complete isolation.
6οΈβ£ Upgrade Now & Stay Safe! π Check out the latest smolagents release and start building safer AI agents today.
βοΈ modification of the cross-entropy loss function designed specifically for training LLMs. βοΈ twist on the standard cross-entropy loss by emphasizing the importance of outlier prediction errors and dynamically normalizing token-level variance. βοΈ more stable and efficient training, leading to models that generalize better.
Check it out, give it a spin, and let me know what you think!
Licensed under the Apache 2.0 license and ready to use. Happy training! π₯π€
made a few improvements on custom grpo trainer: - added sequence similarity reward (seems to work) - improved vllm support (5x inference speed) - adjusted reward scores (this helped with format/accuracy) - can now push to hf hub (already pushed mine lol: Jaward/smollm2_360m_grpo_gsm8k_reasoner)