Building Powerful AI Agents with smolagents: A Minimalist Approach

If you've been following our previous discussions on AI agents, you know that frameworks often add unnecessary complexity to what should be a straightforward process. While we've shown how to build agents from scratch without frameworks, there are times when a minimal abstraction layer can save you from writing repetitive boilerplate code.
That's where smolagents enters the ring - a refreshingly minimal library for building powerful AI agents in just a few lines of code, built by the nice folks at Hugging Face.
Why smolagents Stands Out
Unlike bloated frameworks that obscure the core agent functionality behind layers of abstraction, smolagents keeps things remarkably simple:
- Truly minimal: The entire agent logic fits in ~1,000 lines of code
- Maximum transparency: You can easily inspect and understand what's happening under the hood
- First-class code execution: Agents write actions as Python code snippets rather than structured tool calls - more on that later.
- Model-agnostic: Works with virtually any LLM from OpenAI, Anthropic, Hugging Face, or local models
- Multi-modal support: Handles text, vision, video, and even audio inputs, similar to what we explored in our analyzing images with GPT-4o guide
- Extensive tool compatibility: Use tools from LangChain, Anthropic's MCP, or even Hugging Face Spaces
Let's get hands-on and see how smolagents strikes this balance between simplicity and power.
Getting Started with smolagents
To warm up, let's run a simple agent using smolagents.
First, install the package:
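```bash
pip install smolagents
```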
Then define your agent, give it the tools it needs, and run it:
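```python
from smolagents import CodeAgent, DuckDuckGoSearchTool, InferenceClientModel

# Class names follow current smolagents releases; older versions
# called the default model class HfApiModel instead.
agent = CodeAgent(tools=[DuckDuckGoSearchTool()], model=InferenceClientModel())

agent.run(
    "How many seconds would it take for a leopard at full speed "
    "to run through the Pont des Arts?"
)
```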
That's it! The agent will:
- Parse the question
- Decide to search for information about leopard speed and the Pont des Arts length
- Write and execute Python code to calculate the answer
- Return a complete, well-reasoned response
Note: By default, smolagents outputs a lot of debug information, helping you understand what's happening at each step. You can disable this by setting verbosity_level=LogLevel.ERROR in the agent constructor.
How Code Agents Work - and Why They're Better Than Traditional Agents
Unlike traditional agents that use structured JSON for tool calls, smolagents' CodeAgent writes its actions as Python code snippets. This approach is particularly powerful when working with complex data structures, as we've seen when creating knowledge graphs with Neo4j. For example, if your question was about the speed of a leopard and the length of the Pont des Arts bridge in Paris, the agent might generate code like this:
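```python
# Illustrative snippet - the concrete numbers would come from the
# agent's earlier web-search steps, and final_answer is the built-in
# tool the agent calls to return its result.
leopard_speed_kmh = 58   # top speed of a leopard
bridge_length_m = 155    # length of the Pont des Arts

speed_ms = leopard_speed_kmh * 1000 / 3600  # convert km/h to m/s
time_seconds = bridge_length_m / speed_ms

final_answer(f"A leopard at full speed would cross the bridge in about {time_seconds:.1f} seconds.")
```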
This approach has been shown to:
- Use 30% fewer steps (and thus 30% fewer LLM calls)
- Achieve higher performance on difficult benchmarks
- Provide more flexibility in how tools are used
Security Considerations
Since executing model-generated code is a genuine security concern, smolagents provides several options:
- A secure Python interpreter to run code more safely in your environment: Hugging Face created a new, restricted Python interpreter that only allows importing whitelisted modules - and even only whitelisted submodules. This is the default behavior.
- Sandboxed environments using E2B or Docker to isolate execution: This lets you execute the agent code in remote, more isolated environments, as shown in the sketch after this list.
- The ability to review code before execution
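For instance, recent smolagents versions let you pick the executor via a constructor parameter - a sketch; treat the exact parameter name and values as version-dependent:

```python
from smolagents import CodeAgent, InferenceClientModel

# Run the generated code in an E2B sandbox instead of the local
# interpreter; requires an E2B API key. "docker" is the other
# sandboxed option (parameter per recent smolagents versions).
agent = CodeAgent(
    tools=[],
    model=InferenceClientModel(),
    executor_type="e2b",
)
```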
Comparing smolagents with Other Frameworks
If you've read our previous post on building agents from scratch, you might wonder why use any framework at all. The answer lies in the balance:
| Approach | Pros | Cons |
|---|---|---|
| Raw Implementation | Complete control, no dependencies | Repetitive boilerplate, reinventing the wheel |
| smolagents | Minimal abstraction, transparent code, flexible | Small learning curve |
| Large Frameworks | Many pre-built components | Complex abstractions, hard to debug, less flexible |
smolagents handles the non-trivial parts (like maintaining consistent code formats throughout the system prompt, parser, and execution) while keeping everything else transparent and accessible.
Model Compatibility of smolagents
smolagents works with virtually any LLM:
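For example (the model IDs below are illustrative, and some class names differ in older releases):

```python
from smolagents import InferenceClientModel, LiteLLMModel, TransformersModel

# Hosted models via the Hugging Face Inference API
model = InferenceClientModel(model_id="Qwen/Qwen2.5-Coder-32B-Instruct")

# Any provider supported by LiteLLM, e.g. OpenAI or Anthropic
model = LiteLLMModel(model_id="anthropic/claude-3-5-sonnet-latest")

# A local model loaded through transformers
model = TransformersModel(model_id="Qwen/Qwen2.5-Coder-1.5B-Instruct")
```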
Sharing Your Agents Created with smolagents
One of the most powerful features of smolagents is the ability to share your agents on Hugging Face Hub:
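The repo name below is a placeholder - substitute your own Hub username:

```python
# Upload your agent to the Hugging Face Hub
agent.push_to_hub("your-username/my-leopard-agent")

# Later, load it back; trust_remote_code acknowledges that you are
# willing to run code from the shared agent definition
from smolagents import CodeAgent

agent = CodeAgent.from_hub("your-username/my-leopard-agent", trust_remote_code=True)
```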
Command Line Interface
smolagents also provides convenient CLI commands:
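For example (the prompts are placeholders; adjust the model flags to your setup):

```bash
# Run a one-off task with a CodeAgent
smolagent "Plan a trip to Tokyo between March 1st and 7th." \
  --model-type "InferenceClientModel" \
  --model-id "Qwen/Qwen2.5-Coder-32B-Instruct"

# Run a vision-enabled web-browsing agent
webagent "Go to example.com and summarize what the page is about."
```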
Note: To run the web-agent, you need to install the helium package.
Best Practices for Building Effective Agents
When building agents with smolagents, following some key principles can dramatically improve performance:
Simplify Your Workflow
The best agentic systems are often the simplest ones. Giving an LLM agency introduces some risk of errors, so simplifying your workflow is crucial:
- Reduce the number of LLM calls: Whenever possible, group multiple tools into one. For example, instead of separate "travel distance API" and "weather API" calls, create a unified "return_spot_information" function - see the sketch after this list.
- Prefer deterministic functions: When possible, use deterministic functions rather than agentic decisions to reduce error risk. This approach reduces costs, latency, and error risk simultaneously!
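Here's a sketch of that idea. The surf-spot scenario and both helper functions are made up for illustration, but the @tool decorator is smolagents' standard way to define a tool:

```python
from smolagents import tool

def get_travel_distance(spot: str) -> float:
    return 42.0  # placeholder: would call a travel distance API

def get_weather(spot: str) -> str:
    return "sunny, 1 m waves"  # placeholder: would call a weather API

@tool
def return_spot_information(spot: str) -> str:
    """Returns travel distance and current weather for a surf spot in a single call.

    Args:
        spot: Name of the surf spot.
    """
    distance = get_travel_distance(spot)
    weather = get_weather(spot)
    return f"{spot}: {distance} km away, current conditions: {weather}"
```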
Improve Information Flow to the LLM
Remember that your LLM engine is like an intelligent robot trapped in a room, with notes passed under the door as its only communication with the outside world. It won't know anything you don't explicitly tell it.
For example, here's a poorly designed weather tool:
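(A sketch - convert_location_to_coordinates and get_weather_report_at_coordinates are hypothetical helpers standing in for real APIs.)

```python
import datetime

from smolagents import tool

@tool
def get_weather_api(location: str, date_time: str) -> str:
    """Returns the weather report.

    Args:
        location: the place you want the weather for.
        date_time: the date and time you want the report for.
    """
    lon, lat = convert_location_to_coordinates(location)  # hypothetical helper
    # No format hint in the docstring: if the LLM guesses the date format
    # wrong, strptime raises a cryptic ValueError with no recovery advice.
    parsed = datetime.datetime.strptime(date_time, "%m/%d/%y %H:%M:%S")
    report = get_weather_report_at_coordinates((lon, lat), parsed)  # hypothetical helper
    return str(report)  # raw list like [28.0, 0.35, 0.85] - meaningless without labels
```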
And here's a better version:
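(Same hypothetical helpers, now with explicit format requirements, a helpful error message, and labeled output.)

```python
import datetime

from smolagents import tool

@tool
def get_weather_api(location: str, date_time: str) -> str:
    """Returns the weather report.

    Args:
        location: the place you want the weather for, e.g. "Anchor Point, Taghazout, Morocco".
        date_time: the date and time, formatted as '%m/%d/%y %H:%M:%S'.
    """
    lon, lat = convert_location_to_coordinates(location)  # hypothetical helper
    try:
        parsed = datetime.datetime.strptime(date_time, "%m/%d/%y %H:%M:%S")
    except ValueError as e:
        raise ValueError(
            "Conversion of `date_time` failed: provide a string in the format "
            f"'%m/%d/%y %H:%M:%S'. Full trace: {e}"
        )
    # hypothetical helper returning (temperature in °C, rain risk 0-1, wave height in m)
    temperature, rain_risk, wave_height = get_weather_report_at_coordinates((lon, lat), parsed)
    return (
        f"Weather report for {location} at {date_time}: "
        f"temperature {temperature}°C, rain risk {rain_risk:.0%}, wave height {wave_height} m."
    )
```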
The improved version provides clear format requirements, better error handling, and a structured output format.
Tips when working with smolagents
Enable Planning Mode
smolagents provides a powerful planning feature that helps agents reflect on their progress and plan next steps:
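```python
from smolagents import CodeAgent, DuckDuckGoSearchTool, InferenceClientModel

agent = CodeAgent(
    tools=[DuckDuckGoSearchTool()],
    model=InferenceClientModel(),
    planning_interval=3,  # insert a planning step every 3 regular steps
)
```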
This adds a supplementary planning step where the LLM updates its knowledge and reflects on what steps to take next, without making any tool calls.
Reduce logging output
By default, smolagents logs a lot of information to help you understand what's happening at each step. If you find this overwhelming, you can reduce the verbosity level:
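```python
from smolagents import CodeAgent, InferenceClientModel
# In some versions LogLevel must be imported from smolagents.monitoring
from smolagents.monitoring import LogLevel

agent = CodeAgent(
    tools=[],
    model=InferenceClientModel(),
    verbosity_level=LogLevel.ERROR,  # only log errors, hide step-by-step output
)
```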
Change final answer output format
The smolagents agent uses the FinalAnswerTool to generate the final, user-facing answer. By default, this tool attempts to create a natural-language answer. However, we found that the LLM sometimes answered with a JSON object instead. Thankfully, it's quite easy to change the output format.
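One possible approach is to subclass the built-in FinalAnswerTool with a stricter description and hand it to the agent - a sketch; whether a user-supplied final_answer tool replaces the default may depend on your smolagents version:

```python
from smolagents import CodeAgent, FinalAnswerTool, InferenceClientModel

class PlainTextFinalAnswer(FinalAnswerTool):
    # A tighter description steers the model away from JSON-wrapped answers
    description = (
        "Provides the final answer to the user as plain natural-language text. "
        "Never wrap the answer in JSON or a code block."
    )

agent = CodeAgent(
    tools=[PlainTextFinalAnswer()],
    model=InferenceClientModel(),
)
```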
Streaming the agent output
As LLM outputs are generally quite slow, chat applications often deliver the answer as a message stream - this gives the user an immediate response even though the full answer is not yet available.
Smolagents also allows you to stream the output. Simply set the stream parameter to True. This returns a generator that yields messages as they become available:
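```python
# stream=True turns agent.run into a generator of intermediate step results
for step in agent.run("How long is the Pont des Arts?", stream=True):
    print(step)  # actions, observations, and finally the answer
```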
Interested in how to train your very own Large Language Model?
We prepared a well-researched guide on how to use the latest advancements in open-source technology to fine-tune your own LLM. This has many advantages, like:
- Cost control
- Data privacy
- Excellent performance - adjusted specifically for your intended use
Conclusion
smolagents strikes a good balance between simplicity and power. It handles the complex parts of agent development while keeping everything transparent and accessible.
By focusing on code-based actions rather than structured tool calls, smolagents enables more flexible, efficient agents that can tackle complex tasks with fewer steps and higher success rates.
If you've been frustrated by the complexity of larger frameworks or tired of writing the same boilerplate code for every agent project, smolagents might be exactly what you're looking for.
Further Reading
- How to Build AI Agents Without Frameworks - Learn how to build AI agents without frameworks for complete control and understanding
- Enhancing Self-Made Agent with Memory - Discover how to extend your agents with memory capabilities for more coherent conversations
- AI Agents with n8n - Explore another lightweight approach to building AI agents using n8n workflow automation