agents-course (Hugging Face Agents Course)

mcp-course

- It’s still free!
- Video 1 walks you through onboarding to the course
- The first live session is next week!
- You can now get a certificate via exam app
- We improved and written material with interactive quizzes

If you’re studying MCP and want a live, interactive, visual, certified course, then join us on the hub!

Jofthomas

posted an update 1 day ago

Post

1448

Meet our new agentic model : 𝗗𝗲𝘃𝘀𝘁𝗿𝗮𝗹

Devstral is an open-source LLM built software engineering tasks built under a collaboration between Mistral AI and All Hands AI 🙌.

𝗞𝗲𝘆 𝗳𝗲𝗮𝘁𝘂𝗿𝗲𝘀 :
• 🤖 𝗔𝗴𝗲𝗻𝘁𝘀 : perfect for Agentic coding
• 🍃 𝗹𝗶𝗴𝗵𝘁𝘄𝗲𝗶𝗴𝗵𝘁: Devstral is a 𝟮𝟰𝗕 parameter based on Mistral small.
• ©️ 𝗔𝗽𝗮𝗰𝗵𝗲 𝟮.𝟬, meaning fully open-source !
• 📄 A 𝟭𝟮𝟴𝗸 context window.

📚Blog : https://mistral.ai/news/devstral
⚡API : The model is also available on our API under the name 𝗱𝗲𝘃𝘀𝘁𝗿𝗮𝗹-𝘀𝗺𝗮𝗹𝗹-𝟮𝟱𝟬𝟱
🤗 repo : mistralai/Devstral-Small-2505

Can't wait to see what you will build with it !

burtenshaw

posted an update 7 days ago

Post

2913

We're thrilled to announce the launch of our comprehensive Model Context Protocol (MCP) Course! This free program is designed to take learners from foundational understanding to practical application of MCP in AI.

Follow the course on the hub:

mcp-course

In this course, you will:
📖 Study Model Context Protocol in theory, design, and practice.
🧑‍💻 Learn to use established MCP SDKs and frameworks.
💾 Share your projects and explore applications created by the community.
🏆 Participate in challenges and evaluate your MCP implementations.
🎓 Earn a certificate of completion.

At the end of this course, you'll understand how MCP works and how to build your own AI applications that leverage external data and tools using the latest MCP standards.

1 reply

·

m-ric

posted an update 10 days ago

Post

2458

𝗔𝗯𝘀𝗼𝗹𝘂𝘁𝗲 𝗭𝗲𝗿𝗼: 𝗟𝗟𝗠𝘀 𝗰𝗮𝗻 𝘁𝗿𝗮𝗶𝗻 𝘄𝗶𝘁𝗵𝗼𝘂𝘁 𝗮𝗻𝘆 𝗲𝘅𝘁𝗲𝗿𝗻𝗮𝗹 𝗱𝗮𝘁𝗮 🤯

Has the "data wall" just been breached?

Recent RL paradigms often relied on a set of questions an answers that needs to be manually curated. Researchers from Tsinghua University went like "why though".

🤔 Indeed, why learn from question designed by a human teacher, when the model can start from their base knowledge and learn by experimenting in a code environment, proposing coding tasks themselves and trying to solve them?

Thus they created “Absolute Zero Reasoning” (AZR), an approach that removes any need for human curated data.

🎭 𝗗𝘂𝗮𝗹 𝗿𝗼𝗹𝗲𝘀:
‣ Proposer: Generates challenging but solvable coding tasks
‣ Solver: Attempts to solve those self-proposed tasks

🧪 𝗧𝗵𝗿𝗲𝗲 𝘁𝗮𝘀𝗸 𝘁𝘆𝗽𝗲𝘀: all types are defined as triplets of program, input and output
‣ Deduction: Give model an input and program, it must deduce the output
‣ Abduction: Give model an program and output, it must find the input that gave said output
‣ Induction: Synthesize a program from input/output pairs
Btw this reminded me of my long-forgotten philosophy classes: Aristotle was more on the induction side, learning from real-world analogies, while Plato was more on the deduction side, trying to progress quite far with just one input and his reasoning.

📊 𝗥𝗲𝘀𝘂𝗹𝘁𝘀:
‣ AZR post-training creates a nice improvement on known models like Qwen2.5-7B
‣ Shows strong cross-domain transfer: coding ↔️ math reasoning

🧐 𝗢𝘁𝗵𝗲𝗿 𝗳𝗶𝗻𝗱𝗶𝗻𝗴𝘀:
‣ Having a better base performance (general or code specific) amplify the gains from Absolute Zero Reasoning
‣ Researchers warn about "Uh-oh moments" (winking to the "aha moments" of DeepSeek) where the model generates concerning goals like "make an extremely convoluted code to outsmart all these humans": so supervision is still needed!

Paper here: Absolute Zero: Reinforced Self-play Reasoning with Zero Data (2505.03335)

m-ric

posted an update 14 days ago

Post

4327

I've made an open version of Google's NotebookLM, and it shows the superiority of the open source tech task! 💪

The app's workflow is simple. Given a source PDF or URL, it extracts the content from it, then tasks Meta's Llama 3.3-70B with writing the podcast script, with a good prompt crafted by @gabrielchua ("two hosts, with lively discussion, fun notes, insightful question etc.")
Then it hands off the text-to-speech conversion to Kokoro-82M, and there you go, you have two hosts discussion any article.

The generation is nearly instant, because:
> Llama 3.3 70B is running at 1,000 tokens/seconds with Cerebras inference
> The audio is generated in streaming mode by the tiny (yet powerful) Kokoro, generating voices faster than real-time.

And the audio generation runs for free on Zero GPUs, hosted by HF on H200s.

Overall, open source solutions rival the quality of closed-source solutions at close to no cost!

Try it here 👉👉 m-ric/open-notebooklm

3 replies

·

burtenshaw

posted an update 21 days ago

Post

2078

Qwen 3 Fine tuning >> MoE. Update the experiment thread to include config and script for fine-tuning the Qwen3-30B-A3B model.

The goal is to make a low latency non-thinking model for a daily driver coding, so 3 billion parameters active should be perfect.

✔️ training running
✔️ evals running
⏭️ improve dataset

The moe isn't going to fit into colab's A100 even with quantization (🙏 @UnslothAI ). So I've been working on HF spaces' H100s for this. Everything is available in the tread and I'll share more tomorrow.

burtenshaw/Qwen3-Code-Lite#1

burtenshaw

posted an update 29 days ago

Post

2501

The rebooted LLM course starts today with an overhauled chapter 1 on Transformers:

👉 Follow the org to join the course:

huggingface-course

We’re starting from the foundations of modern generative AI by looking at transformers. This chapter is expanded in depth and features so contains new material like:

FREE and CERTIFIED exam on fundamentals of transformers
deeper exploration of transformer architectures and attention mechanisms
end -to-end exploration of inference strategies for prefill and decode steps

The course has leveled up in complexity and depth, so this a great time to join in if you want to build you own AI models.

m-ric

posted an update about 1 month ago

Post

2816

New king of open VLMs: InternVL3 takes Qwen 2.5's crown! 👑

InternVL have been a wildly successful series of model : and the latest iteration has just taken back their crown thanks to their superior, natively multimodal vision training pipeline.

➡️ Most of the vision language models (VLMs) these days are built like Frankenstein : take a good text-only Large Language Model (LLM) backbone, stitch a specific vision transformer (ViT) on top of it. Then the training is sequential 🔢 : 1. Freeze the LLM weights while you train the ViT only to work with the LLM part, then 2. Unfreeze all weights to train all weights in order to work together.

💫 The Shanghai Lab decided to challenge this paradigm and chose this approach that they call "native". For each of their model sizes, they still start from a good LLM (mostly Qwen-2.5 series, did I tell you I'm a huge fan of Qwen? ❤️), and stitch the ViT, but they don't freeze anything : they train all weights together with interleaved text and image understanding data in a single pre-training phase 🎨.

They claim it results in more seamless interactions between modalities. And the results prove them right: they took the crown of top VLMs, at nearly all sizes, from their Qwen-2.5 parents. 👑

2 replies

·

burtenshaw

posted an update about 1 month ago

Post

1976

Hacked my presentation building with inference providers, Cohere command a, and sheer simplicity. Use this script if you’re burning too much time on presentations:

🔗 https://github.com/burtenshaw/course_generator/blob/main/scripts/create_presentation.py

This is what it does:
- uses command a to generates slides and speaker notes based on some material.
- it renders the material in remark open format and imports all images, tables, etc
- you can then review the slides as markdown and iterate
- export to either pdf or pptx using backslide

🚀 Next steps are: add text to speech for the audio and generate a video. This should make Hugging Face educational content scale to a billion AI Learners.

1 reply

·

thomwolf

posted an update about 1 month ago

Post

4919

If you've followed the progress of robotics in the past 18 months, you've likely noticed how robotics is increasingly becoming the next frontier that AI will unlock.

At Hugging Face—in robotics and across all AI fields—we believe in a future where AI and robots are open-source, transparent, and affordable; community-built and safe; hackable and fun. We've had so much mutual understanding and passion working with the Pollen Robotics team over the past year that we decided to join forces!

You can already find our open-source humanoid robot platform Reachy 2 on the Pollen website and the Pollen community and people here on the hub at

pollen-robotics

We're so excited to build and share more open-source robots with the world in the coming months!

1 reply

·

thomwolf

authored a paper about 1 month ago

SmolVLM: Redefining small and efficient multimodal models

Paper • 2504.05299 • Published Apr 7 • 184

pcuenq

authored a paper about 1 month ago

SmolVLM: Redefining small and efficient multimodal models

Paper • 2504.05299 • Published Apr 7 • 184

thomwolf

authored a paper about 1 month ago

YourBench: Easy Custom Evaluation Sets for Everyone

Paper • 2504.01833 • Published Apr 2 • 20

m-ric

posted an update about 2 months ago

Post

2373

🚀 DeepSeek R1 moment has come for GUI agents: Rule-based Reinforcement Learning gives better results than SFT with 500x smaller datasets!

Traditionally (by which I mean "in the last few months"), GUI agents have been trained with supervised fine-tuning (SFT). This meant, collecting huge datasets of screen captures from people using computers, and using these to fine-tune your model. 📚

👉 But last week, a new paper introduced UI-R1, applying DeepSeek's R1-style rule-based reinforcement learning (RL) specifically to GUI action prediction tasks.
This is big news: with RL, maybe we could build good agents without the need for huge datasets.

UI-R1 uses a unified reward function that evaluates multiple responses from models, optimizing via policy algorithms like Group Relative Policy Optimization (GRPO).

Specifically, the reward function assesses:
🎯 Action type accuracy: Does the predicted action match the ground truth?
📍 Coordinate accuracy (specifically for clicks): Is the predicted click within the correct bounding box?
📑 Output format: Does the model clearly articulate both its reasoning and final action?

Using just 136 carefully selected mobile tasks—compared to 76,000 tasks for larger models like OS-Atlas—UI-R1 shows significant efficiency and improved performance:
📈 Boosted action prediction accuracy from 76% to 89% on AndroidControl.
🌐 Outperformed larger, SFT-trained models (e.g., OS-Atlas-7B), demonstrating superior results with vastly fewer data points (136 tasks vs. 76K).
🔍 Enhanced adaptability and generalization, excelling even in out-of-domain scenarios.

The paper tests this RL-based method only in low-level GUI tasks. Could it generalize to more complex interactions? 🧐

Read the full paper here 👉 UI-R1: Enhancing Action Prediction of GUI Agents by Reinforcement Learning (2503.21620)

thomwolf

posted an update about 2 months ago

Post

3440

The new DeepSite space is really insane for vibe-coders
enzostvs/deepsite

With the wave of vibe-coding-optimized LLMs like the latest open-source DeepSeek model (version V3-0324), you can basically prompt out-of-the-box and create any app and game in one-shot.

It feels so powerful to me, no more complex framework or under-the-hood prompt engineering to have a working text-to-app tool.

AI is eating the world and *open-source* AI is eating AI itself!

PS: and even more meta is that the DeepSite app and DeepSeek model are both fully open-source code => time to start recursively improve?

PPS: you still need some inference hosting unless you're running the 600B param model at home, so check the very nice list of HF Inference Providers for this model: deepseek-ai/DeepSeek-V3-0324

1 reply

·

burtenshaw

posted an update about 2 months ago

Post

3242

NEW UNIT in the Hugging Face Reasoning course. We dive deep into the algorithm behind DeepSeek R1 with an advanced and hands-on guide to interpreting GRPO.

🔗

reasoning-course

This unit is super useful if you’re tuning models with reinforcement learning. It will help with:

- interpreting loss and reward progression during training runs
- selecting effective parameters for training
- reviewing and defining effective reward functions

This unit also works up smoothly toward the existing practical exercises form @mlabonne and Unsloth.

📣 Shout out to @ShirinYamani who wrote the unit. Follow for more great content.

1 reply

·

burtenshaw

posted an update 2 months ago

Post

3867

The Hugging Face Agents Course now includes three major agent frameworks!

🔗

agents-course

This includes LlamaIndex, LangChain, and our very own smolagents. We've worked to integrate the three frameworks in distinctive ways so that learners can reflect on when and where to use each.

This also means that you can follow the course if you're already familiar with one of these frameworks, and soak up some of the fundamental knowledge in earlier units.

Hopefully, this makes the agents course as open to as many people as possible.

3 replies

·

Hugging Face Agents Course

AI & ML interests

Recent Activity

agents-course's activity

agents-course/certificates

agents-course/unit4-students-scores

agents-course/course-certificates-of-excellence

SmolVLM: Redefining small and efficient multimodal models

SmolVLM: Redefining small and efficient multimodal models

YourBench: Easy Custom Evaluation Sets for Everyone

AI & ML interests

Recent Activity

Team members 11

agents-course's activity