---
title: README
emoji: 🍫
colorFrom: yellow
colorTo: green
sdk: static
pinned: false
---
<p align="center">
<img src="https://raw.githubusercontent.com/ml6team/fondant/main/docs/art/fondant_banner.svg" height="250px"/>
</p>
<p align="center">
<i>Sweet data-centric foundation model fine-tuning</i>
<br>
<a href="https://fondant.readthedocs.io/en/stable/"><strong>Explore the docs »</strong></a>
<br>
<br>
<a href="https://discord.gg/HnTdWhydGp"><img alt="Discord" src="https://dcbadge.vercel.app/api/server/HnTdWhydGp?style=flat-square"></a>
</p>
---

**Fondant helps you create high-quality datasets to train or fine-tune foundation models such as:**

- 🎨 Stable Diffusion
- 📄 GPT-like Large Language Models (LLMs)
- 🔎 CLIP
- ✂️ Segment Anything (SAM)
- ➕ And many more

## 🪤 Why Fondant?

Foundation models simplify inference by solving multiple tasks across modalities with a simple
prompt-based interface. But what they've gained in the front, they've lost in the back.
**These models require enormous amounts of data, moving complexity towards data preparation**, and
leaving few parties able to train their own models.

We believe that **innovation is a group effort**, requiring collaboration. While the community has
been building and sharing models, everyone is still building their data preparation from scratch.
**Fondant is the platform where we meet to build and share data preparation workflows.**

Fondant offers a framework to build **composable data preparation pipelines, with reusable
components, optimized to handle massive datasets**. Stop building from scratch, and start
reusing components to:

- Extend your data with public datasets
- Generate new modalities using captioning, segmentation, translation, image generation, ...
- Distill knowledge from existing foundation models
- Filter out low-quality data
- Deduplicate data

And create high-quality datasets to fine-tune your own foundation models.
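
To make the idea of composable, reusable components concrete, here is a minimal sketch in plain Python. It does not use Fondant's API: `filter_short_captions`, `deduplicate`, `compose`, and the record fields `url` and `caption` are hypothetical names chosen only for illustration. Each component takes an iterable of records and returns one, so components can be chained in any order.

```python
from typing import Callable, Iterable

# Illustrative types only; these are not Fondant classes.
Record = dict
Component = Callable[[Iterable[Record]], Iterable[Record]]


def filter_short_captions(min_words: int) -> Component:
    """Build a component that drops records whose caption has too few words."""
    def run(records: Iterable[Record]) -> Iterable[Record]:
        return (r for r in records if len(r["caption"].split()) >= min_words)
    return run


def deduplicate(key: str) -> Component:
    """Build a component that keeps the first record seen for each key value."""
    def run(records: Iterable[Record]) -> Iterable[Record]:
        seen = set()
        for r in records:
            if r[key] not in seen:
                seen.add(r[key])
                yield r
    return run


def compose(*components: Component) -> Component:
    """Chain components so each one consumes the previous one's output."""
    def run(records: Iterable[Record]) -> Iterable[Record]:
        for component in components:
            records = component(records)
        return records
    return run


# Assemble a pipeline from reusable parts (hypothetical example data).
pipeline = compose(filter_short_captions(min_words=3), deduplicate(key="url"))

raw = [
    {"url": "a.png", "caption": "a red bicycle leaning on a wall"},
    {"url": "a.png", "caption": "a red bicycle leaning on a wall"},
    {"url": "b.png", "caption": "cat"},
]
cleaned = list(pipeline(raw))
```

Because every component shares the same input and output shape, a filter written for one dataset can be dropped into another pipeline unchanged, which is the kind of reuse described above.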
<p align="right">(<a href="#chocolate_bar-fondant">back to top</a>)</p>