Alvaro Bartolome

alvarobartt

https://alvarobartt.me

AI & ML interests

machine learning @huggingface

Recent Activity

liked a model 3 days ago

deepseek-ai/DeepSeek-R1-Zero

liked a model 4 days ago

mistralai/Mistral-Small-24B-Instruct-2501

liked a model 4 days ago

mistralai/Mistral-Small-24B-Base-2501

Articles

🤗 Serve any model with Inference Endpoints + Custom Handlers

Introducing HUGS - Scale your AI with Open Models

Deploy Meta Llama 3.1 405B on Google Cloud Vertex AI

Llama 3.1 - 405B, 70B & 8B with multilinguality and long context

🧑‍⚖️ "Replacing Judges with Juries" using distilabel

Deploying 🤗 Hub models in Vertex AI

🏷️ Build AI Feedback (AIF) datasets for LLM alignment with ⚗️ distilabel

💨 Introducing Notus: a DPO fine-tune of Zephyr with a focus on high-quality data

🤗 LLM suggestions in Argilla with HuggingFace Inference Endpoints

alvarobartt's activity

upvoted a paper 6 days ago

DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

Paper • 2501.12948 • Published 12 days ago • 281

upvoted 2 articles 6 days ago

Article

Welcome to Inference Providers on the Hub 🔥

6 days ago • 205

Article

Open-R1: a fully open reproduction of DeepSeek-R1

6 days ago • 560

upvoted an article 14 days ago

Article

You could have designed state of the art positional encoding

Nov 25, 2024 • 139

upvoted a paper 17 days ago

Towards Best Practices for Open Datasets for LLM Training

Paper • 2501.08365 • Published 19 days ago • 52

upvoted a collection 20 days ago

MiniCPM

The MiniCPM family of LLMs and VLLMs. • 32 items • Updated 15 days ago • 61

upvoted a paper 21 days ago

The GAN is dead; long live the GAN! A Modern GAN Baseline

Paper • 2501.05441 • Published 24 days ago • 87

upvoted 2 collections 25 days ago

Jan 10 Releases 🌨️

38 items • Updated 24 days ago • 12

Phi-4 (All Versions)

Microsoft's new Phi-4 model in all formats. Includes GGUF, 4-bit bnb and original versions. Includes Unsloth's bug fixes. • 4 items • Updated about 5 hours ago • 39

upvoted a paper about 1 month ago

Spectrum: Targeted Training on Signal to Noise Ratio

Paper • 2406.06623 • Published Jun 7, 2024 • 13

upvoted a collection about 2 months ago

2024 Interconnects Artifacts

Models & datasets mentioned in the bottom section of posts! • 280 items • Updated Jan 2 • 6

upvoted a paper about 2 months ago

Evaluating Language Models as Synthetic Data Generators

Paper • 2412.03679 • Published Dec 4, 2024 • 46

upvoted 3 collections about 2 months ago

PaliGemma 2 Release

Vision-Language Models available in multiple 3B, 10B and 28B variants. • 23 items • Updated Dec 13, 2024 • 134

LLM Reasoning Papers

Papers to improve reasoning capabilities of LLMs • 20 items • Updated 19 days ago • 113

Llama 3.3

This collection hosts the transformers and original repos of the Llama 3.3 • 1 item • Updated Dec 6, 2024 • 126

upvoted a collection 2 months ago

SmolVLM

State-of-the-art compact VLMs for on-device applications: Base, Synthetic, and Instruct • 5 items • Updated Dec 22, 2024 • 32

upvoted a paper 2 months ago

Unveiling Encoder-Free Vision-Language Models

Paper • 2406.11832 • Published Jun 17, 2024 • 51

upvoted an article 2 months ago

Article

Use Models from the Hugging Face Hub in LM Studio

Nov 28, 2024 • 135

upvoted 2 collections 3 months ago

Qwen2.5-Coder

Code-specific model series based on Qwen2.5 • 40 items • Updated Nov 28, 2024 • 273

AMD-OLMo

AMD-OLMo are a series of 1 billion parameter language models trained by AMD on AMD Instinct™ MI250 GPUs based on OLMo. • 4 items • Updated Oct 31, 2024 • 18