---
title: README
emoji: 🍫
colorFrom: yellow
colorTo: green
sdk: static
pinned: false
---

Sweet data-centric foundation model fine-tuning
Explore the docs »

---

**Fondant helps you create high-quality datasets to train or fine-tune foundation models such as:**

- 🎨 Stable Diffusion
- 📄 GPT-like Large Language Models (LLMs)
- 🔎 CLIP
- ✂️ Segment Anything (SAM)
- ➕ And many more

## 🪤 Why Fondant?

Foundation models simplify inference by solving multiple tasks across modalities with a simple prompt-based interface. But what they've gained in the front, they've lost in the back. **These models require enormous amounts of data, moving complexity towards data preparation**, and leaving few parties able to train their own models.

We believe that **innovation is a group effort**, requiring collaboration. While the community has been building and sharing models, everyone is still building their data preparation from scratch. **Fondant is the platform where we meet to build and share data preparation workflows.**

Fondant offers a framework to build **composable data preparation pipelines, with reusable components, optimized to handle massive datasets**. Stop building from scratch, and start reusing components to:

- Extend your data with public datasets
- Generate new modalities using captioning, segmentation, translation, image generation, ...
- Distill knowledge from existing foundation models
- Filter out low-quality data
- Deduplicate data

And create high-quality datasets to fine-tune your own foundation models.
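To give a feel for what such a composable pipeline looks like, here is a minimal sketch in Python. It is illustrative only: the `Pipeline` class, the `read`/`apply` chaining style, and the component names and arguments shown are assumptions based on Fondant's reusable-component design and may not match the current API, so check the docs for the exact interface.

```python
# Illustrative sketch only: class names, methods, component names, and
# arguments below are assumptions and may differ from the current Fondant
# API. Consult the Fondant docs for the exact interface.
from fondant.pipeline import Pipeline

# A pipeline chains reusable components; each step consumes and produces a
# shared dataset, so components can be mixed and matched freely.
pipeline = Pipeline(
    name="image-dataset-prep",        # hypothetical pipeline name
    base_path="./fondant-artifacts",  # where intermediate data is stored
)

# Start from a public dataset (hypothetical component name and arguments).
dataset = pipeline.read(
    "load_from_hf_hub",
    arguments={"dataset_name": "some/public-image-dataset"},
)

# Reuse existing components instead of writing preprocessing from scratch.
dataset = dataset.apply(
    "filter_image_resolution",  # drop low-resolution images
    arguments={"min_image_dim": 512, "max_aspect_ratio": 2.5},
)
dataset = dataset.apply("caption_images")  # generate a new text modality
```

The point of the sketch is the composition: because every component reads and writes a dataset, steps such as filtering, captioning, and deduplication can be chained in any order and shared across projects.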
