Blog, Articles, and discussions

Seq vs Seq: the Ettin Suite of Paired Encoders and Decoders

By July 16, 2025 • 54

Community Articles

Introducing Command A Vision: Multimodal AI built for Business

and 3 others •

7 days ago

• 61

What Open-Source Developers Need to Know about the EU AI Act's Rules for GPAI Models

and 5 others •

4 days ago

• 20

LLM agent experiment with a purpose-built RPG and tool calls. (Work in progress)

•

2 days ago

• 6

Why We Built the OpenMDW License: A Comprehensive License for ML Models

•

Jul 2

• 23

makeMoE: Implement a Sparse Mixture of Experts Language Model from Scratch

•

May 7, 2024

• 93

Mastering Tensor Dimensions in Transformers

•

Jan 12

• 83

DeepSeek-R1 Dissection: Understanding PPO & GRPO Without Any Prior Reinforcement Learning Knowledge

•

Feb 7

• 199

Falcon-H1: A Family of Hybrid-Head Language Models Redefining Efficiency and Performance

and 5 others •

May 21

• 32

Unlocking Healthcare AI: I'm Releasing State-of-the-Art Medical Models for Free. Forever.

•

22 days ago

• 133

Introducing ColQwen-Omni: Retrieve in every modality

and 4 others •

22 days ago

• 62

Unsupervised Model Improvement via Internal Coherence Maximization: Outperforming Human-Supervised Methods Through Self-Elicitation

•

5 days ago

• 4

From Zero to MCP: Three Lessons I Learned Building Tools for LLMs

•

8 days ago

• 7

📚 3LM: A Benchmark for Arabic LLMs in STEM and Code

and 7 others •

6 days ago

• 4

Measuring Open-Source Llama Nemotron Models on DeepResearch Bench

•

3 days ago

• 4

ColPali: Efficient Document Retrieval with Vision Language Models 👀

•

Jul 5, 2024

• 285

G2P Shrinks Speech Models

•

Feb 5

• 65

Introducing IDEFICS: An Open Reproduction of State-of-the-art Visual Language Model

By August 22, 2023 • 36

Fine-tune Llama 2 with DPO

By August 8, 2023 • 59

Llama 2 is here - get it on Hugging Face

By July 18, 2023 • 28

Open-Source Text Generation & LLM Ecosystem at Hugging Face

By July 17, 2023 • 3

Accelerating Vision-Language Models: BridgeTower on Habana Gaudi2

By June 29, 2023 • 3

What's going on with the Open LLM Leaderboard?

By June 23, 2023 • 43

Can foundation models label data like humans?

By June 12, 2023 • 1

Welcome fastText to the 🤗 Hub

By June 6, 2023 • 5

The Falcon has landed in the Hugging Face ecosystem

By June 5, 2023 • 16

Smaller is better: Q8-Chat, an efficient generative AI experience on Xeon

By May 16, 2023 • 2

Run a Chatgpt-like Chatbot on a Single GPU with ROCm

By May 15, 2023 • 2

Introducing RWKV — An RNN with the advantages of a transformer

By May 15, 2023 • 22

Assisted Generation: a new direction toward low-latency text generation

By May 11, 2023 • 69

Creating a Coding Assistant with StarCoder

By May 9, 2023 • 2

Community Articles

Towards Open Evolutionary Agents

and 1 other •

3 days ago

• 12

Code a simple RAG from scratch

•

Oct 29, 2024

• 145

Uncensor any LLM with abliteration

•

Jun 13, 2024

• 640

Introduction to State Space Models (SSM)

•

Jul 19, 2024

• 160
