Sweet data-centric foundation model fine-tuning
Explore the docs »
Fondant helps you create high-quality datasets to train or fine-tune foundation models such as:
- Stable Diffusion
- GPT-like Large Language Models (LLMs)
- CLIP
- Segment Anything (SAM)
- And many more
Why Fondant?
Foundation models simplify inference by solving multiple tasks across modalities with a simple prompt-based interface. But what they've gained in the front, they've lost in the back. These models require enormous amounts of data, moving complexity towards data preparation, and leaving few parties able to train their own models.
We believe that innovation is a group effort, requiring collaboration. While the community has been building and sharing models, everyone is still building their data preparation from scratch. Fondant is the platform where we meet to build and share data preparation workflows.
Fondant offers a framework to build composable data preparation pipelines, with reusable components, optimized to handle massive datasets. Stop building from scratch, and start reusing components to:
- Extend your data with public datasets
- Generate new modalities using captioning, segmentation, translation, image generation, ...
- Distill knowledge from existing foundation models
- Filter out low-quality data
- Deduplicate data
And create high-quality datasets to fine-tune your own foundation models.
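To make the idea of composable, reusable components concrete, here is a minimal sketch in plain Python. It is not Fondant's API: the names `filter_short_captions`, `deduplicate`, and `run_pipeline`, the record layout, and the word-count quality heuristic are all hypothetical and chosen only for illustration.

```python
from typing import Callable, Iterable, Iterator

# Hypothetical record layout: one dict per sample (illustration only).
Record = dict
Component = Callable[[Iterable[Record]], Iterator[Record]]


def filter_short_captions(min_words: int) -> Component:
    """Build a component that drops records whose caption has too few words."""
    def component(records: Iterable[Record]) -> Iterator[Record]:
        for record in records:
            if len(record["caption"].split()) >= min_words:
                yield record
    return component


def deduplicate(key: str) -> Component:
    """Build a component that keeps only the first record for each key value."""
    def component(records: Iterable[Record]) -> Iterator[Record]:
        seen = set()
        for record in records:
            if record[key] not in seen:
                seen.add(record[key])
                yield record
    return component


def run_pipeline(records: Iterable[Record], components: list) -> list:
    """Chain the components in order; each one consumes the previous output."""
    for component in components:
        records = component(records)
    return list(records)


raw = [
    {"url": "a.png", "caption": "a red bicycle leaning on a wall"},
    {"url": "a.png", "caption": "a red bicycle leaning on a wall"},
    {"url": "b.png", "caption": "img"},
    {"url": "c.png", "caption": "two dogs playing in the snow"},
]

clean = run_pipeline(raw, [filter_short_captions(3), deduplicate("url")])
```

Because every component shares the same input and output shape, the same filtering or deduplication step can be reused in another pipeline or reordered without changing its code.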