Beyond Chinchilla-Optimal: Accounting for Inference in Language Model Scaling Laws Paper • 2401.00448 • Published Dec 31, 2023 • 28
Scaling Autoregressive Models for Content-Rich Text-to-Image Generation Paper • 2206.10789 • Published Jun 22, 2022 • 4
How Do Large Language Models Acquire Factual Knowledge During Pretraining? Paper • 2406.11813 • Published Jun 17 • 30
Falcon Mamba: The First Competitive Attention-free 7B Language Model Paper • 2410.05355 • Published 11 days ago • 26
MLE-bench: Evaluating Machine Learning Agents on Machine Learning Engineering Paper • 2410.07095 • Published 9 days ago • 6
Aria: An Open Multimodal Native Mixture-of-Experts Model Paper • 2410.05993 • Published 10 days ago • 102
Addition is All You Need for Energy-efficient Language Models Paper • 2410.00907 • Published 17 days ago • 131
Article wHy DoNt YoU jUsT uSe ThE lLaMa ToKeNiZeR?? By catherinearnett • 21 days ago • 33
AttnLRP: Attention-Aware Layer-wise Relevance Propagation for Transformers Paper • 2402.05602 • Published Feb 8 • 4
Molmo and PixMo: Open Weights and Open Data for State-of-the-Art Multimodal Models Paper • 2409.17146 • Published 23 days ago • 96
FourCastNet: A Global Data-driven High-resolution Weather Model using Adaptive Fourier Neural Operators Paper • 2202.11214 • Published Feb 22, 2022 • 1
LLMs Still Can't Plan; Can LRMs? A Preliminary Evaluation of OpenAI's o1 on PlanBench Paper • 2409.13373 • Published 28 days ago • 2
PlanBench: An Extensible Benchmark for Evaluating Large Language Models on Planning and Reasoning about Change Paper • 2206.10498 • Published Jun 21, 2022 • 1
YesBut: A High-Quality Annotated Multimodal Dataset for evaluating Satire Comprehension capability of Vision-Language Models Paper • 2409.13592 • Published 28 days ago • 46
Article Does Daily Software Engineering Work Need Reasoning Models? By onekq • 24 days ago • 5
Qwen2.5 Collection Qwen2.5 language models, including pretrained and instruction-tuned models in 7 sizes: 0.5B, 1.5B, 3B, 7B, 14B, 32B, and 72B. • 45 items • Updated 30 days ago • 258
Article Fine-tuning LLMs to 1.58bit: extreme quantization made easy about 1 month ago • 156
Chain of Thought Empowers Transformers to Solve Inherently Serial Problems Paper • 2402.12875 • Published Feb 20 • 13
DSBench: How Far Are Data Science Agents to Becoming Data Science Experts? Paper • 2409.07703 • Published Sep 12 • 66
Theory, Analysis, and Best Practices for Sigmoid Self-Attention Paper • 2409.04431 • Published Sep 6 • 1
General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model Paper • 2409.01704 • Published Sep 3 • 80
NATURAL PLAN: Benchmarking LLMs on Natural Language Planning Paper • 2406.04520 • Published Jun 6 • 10
Towards a Unified View of Preference Learning for Large Language Models: A Survey Paper • 2409.02795 • Published Sep 4 • 72
Fire-Flyer AI-HPC: A Cost-Effective Software-Hardware Co-Design for Deep Learning Paper • 2408.14158 • Published Aug 26 • 2
Article LAVE: Zero-shot VQA Evaluation on Docmatix with LLMs - Do We Still Need Fine-Tuning? Jul 25 • 18
Prompt Chaining or Stepwise Prompt? Refinement in Text Summarization Paper • 2406.00507 • Published Jun 1 • 1
Let Me Speak Freely? A Study on the Impact of Format Restrictions on Performance of Large Language Models Paper • 2408.02442 • Published Aug 5 • 18
Online DPO: Online Direct Preference Optimization with Fast-Slow Chasing Paper • 2406.05534 • Published Jun 8 • 3
The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery Paper • 2408.06292 • Published Aug 12 • 115
Chain-of-Table: Evolving Tables in the Reasoning Chain for Table Understanding Paper • 2401.04398 • Published Jan 9 • 20
Building and better understanding vision-language models: insights and future directions Paper • 2408.12637 • Published Aug 22 • 115
Jamba-1.5 Collection The AI21 Jamba family comprises state-of-the-art, hybrid SSM-Transformer instruction-following foundation models • 2 items • Updated Aug 22 • 80
Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters Paper • 2408.03314 • Published Aug 6 • 33
LongWriter: Unleashing 10,000+ Word Generation from Long Context LLMs Paper • 2408.07055 • Published Aug 13 • 65
Sibyl: Simple yet Effective Agent Framework for Complex Real-world Reasoning Paper • 2407.10718 • Published Jul 15 • 17
SaulLM-54B & SaulLM-141B: Scaling Up Domain Adaptation for the Legal Domain Paper • 2407.19584 • Published Jul 28 • 60
Towards Efficient Generative Large Language Model Serving: A Survey from Algorithms to Systems Paper • 2312.15234 • Published Dec 23, 2023 • 3