TRL - Transformer Reinforcement Learning

TRL is a full-stack library that provides a set of tools to train transformer language models with methods such as Supervised Fine-Tuning (SFT), Group Relative Policy Optimization (GRPO), Direct Preference Optimization (DPO), Reward Modeling, and more. The library is integrated with 🤗 Transformers.

You can also explore TRL-related models, datasets, and demos in the TRL Hugging Face organization.
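To give a flavor of what training with TRL looks like, here is a minimal supervised fine-tuning sketch. It assumes `trl` and `datasets` are installed; the model checkpoint and dataset id are illustrative placeholders, and the full quickstart is covered in the Getting Started section.

```python
def run_sft(model_id: str = "Qwen/Qwen2.5-0.5B") -> None:
    """Sketch of a minimal SFT run with TRL (illustrative, not executed here)."""
    from datasets import load_dataset
    from trl import SFTTrainer

    # Illustrative dataset id: assumed to be a chat-style dataset
    # in a format the SFT trainer accepts.
    dataset = load_dataset("trl-lib/Capybara", split="train")

    # SFTTrainer handles tokenization and the training loop.
    trainer = SFTTrainer(model=model_id, train_dataset=dataset)
    trainer.train()
```

The same pattern (a trainer class plus a dataset) applies to the other trainers, such as those for DPO and GRPO.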

Learn

Learn post-training with TRL and other libraries in the 🤗 smol course.

Contents

The documentation is organized into the following sections:

  • Getting Started: installation and quickstart guide.
  • Conceptual Guides: dataset formats, training FAQ, and understanding logs.
  • How-to Guides: reducing memory usage, speeding up training, distributing training, etc.
  • Integrations: DeepSpeed, Liger Kernel, PEFT, etc.
  • Examples: example overview, community tutorials, etc.
  • API: trainers, utils, etc.
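As a small preview of the dataset formats covered in the Conceptual Guides, TRL trainers commonly work with either a plain-text ("standard") format or a chat-message ("conversational") format. A minimal sketch of one training example in each, as plain Python dicts (field names follow the dataset formats guide):

```python
# "Standard" (language-modeling) format: a single text field per example.
standard_example = {"text": "TRL is a library for post-training transformers."}

# "Conversational" format: a list of chat messages, each with a role and content.
conversational_example = {
    "messages": [
        {"role": "user", "content": "What does TRL stand for?"},
        {"role": "assistant", "content": "Transformer Reinforcement Learning."},
    ]
}
```

Which format a given trainer expects (and how prompt/completion pairs fit in) is detailed in the dataset formats guide.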

Blog posts
