---
title: README
emoji: 🍫
colorFrom: yellow
colorTo: green
sdk: static
pinned: false
---

Sweet data-centric foundation model fine-tuning
Explore the docs »

---

**Fondant helps you create high-quality datasets to train or fine-tune foundation models such as:**

- 🎨 Stable Diffusion
- 📄 GPT-like Large Language Models (LLMs)
- 🔎 CLIP
- ✂️ Segment Anything (SAM)
- ➕ And many more

## 🪤 Why Fondant?

Foundation models simplify inference by solving multiple tasks across modalities with a simple prompt-based interface. But what they've gained in the front, they've lost in the back. **These models require enormous amounts of data, moving complexity towards data preparation**, and leaving few parties able to train their own models.

We believe that **innovation is a group effort**, requiring collaboration. While the community has been building and sharing models, everyone is still building their data preparation from scratch. **Fondant is the platform where we meet to build and share data preparation workflows.**

Fondant offers a framework to build **composable data preparation pipelines, with reusable components, optimized to handle massive datasets**. Stop building from scratch, and start reusing components to:

- Extend your data with public datasets
- Generate new modalities using captioning, segmentation, translation, image generation, ...
- Distill knowledge from existing foundation models
- Filter out low-quality data
- Deduplicate data

And create high-quality datasets to fine-tune your own foundation models.
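To give a feel for what such a composable pipeline looks like, here is a minimal sketch in Python. It is illustrative only: the `Pipeline` class, the `read`/`apply` chaining style, and the component names and arguments shown are assumptions based on Fondant's reusable-component design and may not match the current API, so check the docs for the exact interface.

```python
# Illustrative sketch only: class names, methods, component names, and
# arguments below are assumptions and may differ from the current Fondant
# API. Consult the Fondant docs for the exact interface.
from fondant.pipeline import Pipeline

# A pipeline chains reusable components; each step consumes and produces a
# shared dataset, so components can be mixed and matched freely.
pipeline = Pipeline(
    name="image-dataset-prep",        # hypothetical pipeline name
    base_path="./fondant-artifacts",  # where intermediate data is stored
)

# Start from a public dataset (hypothetical component name and arguments).
dataset = pipeline.read(
    "load_from_hf_hub",
    arguments={"dataset_name": "some/public-image-dataset"},
)

# Reuse existing components instead of writing preprocessing from scratch.
dataset = dataset.apply(
    "filter_image_resolution",  # drop low-resolution images
    arguments={"min_image_dim": 512, "max_aspect_ratio": 2.5},
)
dataset = dataset.apply("caption_images")  # generate a new text modality
```

The point of the sketch is the composition: because every component reads and writes a dataset, steps such as filtering, captioning, and deduplication can be chained in any order and shared across projects.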
