Create README.md
README.md
ADDED
@@ -0,0 +1,42 @@
---
license: creativeml-openrail-m
language:
- en
library_name: diffusers
pipeline_tag: text-to-image
tags:
- stable-diffusion
---

# 🧩 TokenCompose SD21 Model Card

[TokenCompose_SD21_B](https://mlpc-ucsd.github.io/TokenCompose/) is a [latent text-to-image diffusion model](https://arxiv.org/abs/2112.10752) finetuned from the [**Stable-Diffusion-v2-1**](https://huggingface.co/stabilityai/stable-diffusion-2-1) checkpoint at a resolution of 768x768 on the [VSR](https://github.com/cambridgeltl/visual-spatial-reasoning) split of [COCO image-caption pairs](https://cocodataset.org/#download) for 32,000 steps with a learning rate of 5e-6. In addition to the denoising loss, the training objective includes token-level grounding terms, which improve multi-category instance composition and photorealism. The "_A/B" suffix distinguishes separate finetuning runs that use the same configuration described above.

# 🧨 Example Usage

We strongly recommend using the [🤗 Diffusers](https://github.com/huggingface/diffusers) library to run our model.

```python
import torch
from diffusers import StableDiffusionPipeline

model_id = "mlpc-lab/TokenCompose_SD21_B"
device = "cuda"

# Load the finetuned weights in half precision and move the pipeline to the GPU.
pipe = StableDiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
pipe = pipe.to(device)

# A prompt that names two object categories.
prompt = "A cat and a wine glass"
image = pipe(prompt).images[0]

image.save("cat_and_wine_glass.png")
```
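
For repeatable comparisons across several prompts, the snippet above can be wrapped in a small function. The following is a minimal sketch, not part of the released code: `prompt_to_filename` and `generate` are hypothetical helper names, the CPU/float32 fallback is an assumption for machines without a CUDA device, and the fixed seed is only one way to make runs reproducible.

```python
import re


def prompt_to_filename(prompt, suffix=".png"):
    """Turn a prompt into a filesystem-friendly file name (hypothetical helper)."""
    slug = re.sub(r"[^a-z0-9]+", "_", prompt.lower()).strip("_")
    return slug + suffix


def generate(prompts, model_id="mlpc-lab/TokenCompose_SD21_B", seed=0):
    """Generate one image per prompt with a fixed seed and save each to disk."""
    # Imported lazily so prompt_to_filename works without torch/diffusers installed.
    import torch
    from diffusers import StableDiffusionPipeline

    # Assumption: fall back to CPU and float32 when no CUDA device is visible.
    use_cuda = torch.cuda.is_available()
    device = "cuda" if use_cuda else "cpu"
    dtype = torch.float16 if use_cuda else torch.float32

    pipe = StableDiffusionPipeline.from_pretrained(model_id, torch_dtype=dtype)
    pipe = pipe.to(device)

    saved = []
    for prompt in prompts:
        # Re-seeding for every prompt makes each image reproducible on its own.
        generator = torch.Generator(device=device).manual_seed(seed)
        image = pipe(prompt, generator=generator).images[0]
        path = prompt_to_filename(prompt)
        image.save(path)
        saved.append(path)
    return saved
```

Calling `generate(["A cat and a wine glass"])` would download the checkpoint on first use and write `a_cat_and_a_wine_glass.png` to the working directory.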

# ⬆️ Improvements over SD21

| Model               | Object Accuracy | MG3 COCO | MG4 COCO | MG5 COCO | MG3 ADE20K | MG4 ADE20K | MG5 ADE20K | FID COCO |
|---------------------|-----------------|----------|----------|----------|------------|------------|------------|----------|
| SD21                | 47.82           | 70.14    | 25.57    | 3.27     | 75.13      | 35.07      | 7.16       | 19.59    |
| TokenCompose (SD21) | 60.10           | 80.48    | 36.69    | 5.71     | 79.51      | 39.59      | 8.13       | 19.15    |
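
To read the table as per-metric gains, the short script below subtracts the SD21 row from the TokenCompose row. It is only an illustrative sketch: the numbers are copied from the table, the variable and function names are made up for this example, and it assumes that higher is better for the accuracy and MG columns while lower is better for FID.

```python
# Scores copied from the table above; the list layout is illustrative.
METRICS = [
    "Object Accuracy", "MG3 COCO", "MG4 COCO", "MG5 COCO",
    "MG3 ADE20K", "MG4 ADE20K", "MG5 ADE20K", "FID COCO",
]
SD21 = [47.82, 70.14, 25.57, 3.27, 75.13, 35.07, 7.16, 19.59]
TOKENCOMPOSE = [60.10, 80.48, 36.69, 5.71, 79.51, 39.59, 8.13, 19.15]


def deltas(baseline, finetuned):
    """Return finetuned minus baseline for each metric, rounded to two decimals."""
    return [round(f - b, 2) for b, f in zip(baseline, finetuned)]


for name, delta in zip(METRICS, deltas(SD21, TOKENCOMPOSE)):
    # A negative delta on "FID COCO" is an improvement, since lower FID is better.
    print(f"{name:16s} {delta:+.2f}")
```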

# 📰 Citation

Coming soon!