weifeng-chen committed on
Commit
a7e4b37
·
1 Parent(s): 37eb6c1
README.md CHANGED
@@ -1,16 +1,10 @@
1
  ---
2
  license: apache-2.0
3
- # inference: false
4
- # pipeline_tag: zero-shot-image-classification
5
- pipeline_tag: feature-extraction
6
 
7
- # inference:
8
- # parameters:
9
  tags:
10
- - clip
11
- - zh
12
- - image-text
13
- - feature-extraction
14
  ---
15
 
16
  # Taiyi-Stable-Diffusion-1B-Chinese-v0.1
@@ -34,16 +28,35 @@ The first open source Chinese Stable diffusion, which was trained on 20M filtere
34
 
35
  我们将[Noah-Wukong](https://wukong-dataset.github.io/wukong-dataset/)数据集(100M)和[Zero](https://zero.so.com/)数据集(23M)用作预训练的数据集,先用[IDEA-CCNL/Taiyi-CLIP-RoBERTa-102M-ViT-L-Chinese](https://huggingface.co/IDEA-CCNL/Taiyi-CLIP-RoBERTa-102M-ViT-L-Chinese)对这两个数据集的图文对相似性进行打分,取CLIP Score大于0.2的图文对作为我们的训练集。 我们使用[IDEA-CCNL/Taiyi-CLIP-RoBERTa-102M-ViT-L-Chinese](https://huggingface.co/IDEA-CCNL/Taiyi-CLIP-RoBERTa-102M-ViT-L-Chinese)作为初始化的text encoder,冻住模型的其他部分,只训练text encoder,以便保留原始模型的生成能力且实现中文概念的对齐。该模型目前在0.2亿图文对上finetune了一个epoch。
36
 
37
- We use [Noah-Wukong](https://wukong-dataset.github.io/wukong-dataset/)(100M) 和 [Zero](https://zero.so.com/)(23M) as our dataset, and take the image and text pairs with CLIP Score (based on [IDEA-CCNL/Taiyi-CLIP-RoBERTa-102M-ViT-L-Chinese](https://huggingface.co/IDEA-CCNL/Taiyi-CLIP-RoBERTa-102M-ViT-L-Chinese)) greater than 0.2 as our Training set. We utilize [IDEA-CCNL/Taiyi-CLIP-RoBERTa-102M-ViT-L-Chinese](https://huggingface.co/IDEA-CCNL/Taiyi-CLIP-RoBERTa-102M-ViT-L-Chinese) as our init model. To keep the powerful generative capability of stable diffusion and align Chinese concept with the iamges, We only train the text encoder and freeze other part of the model.
38
 
39
- ### 下游效果 Performance
 
 
40
 
 
 
41
 
42
-
43
- ## 使用 Usage
44
-
45
 
46
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
47
 
48
  ## 引用 Citation
49
 
 
1
  ---
2
  license: apache-2.0
 
 
 
3
 
 
 
4
  tags:
5
+ - stable-diffusion
6
+ - stable-diffusion-diffusers
7
+ - text-to-image
 
8
  ---
9
 
10
  # Taiyi-Stable-Diffusion-1B-Chinese-v0.1
 
28
 
29
  我们将[Noah-Wukong](https://wukong-dataset.github.io/wukong-dataset/)数据集(100M)和[Zero](https://zero.so.com/)数据集(23M)用作预训练的数据集,先用[IDEA-CCNL/Taiyi-CLIP-RoBERTa-102M-ViT-L-Chinese](https://huggingface.co/IDEA-CCNL/Taiyi-CLIP-RoBERTa-102M-ViT-L-Chinese)对这两个数据集的图文对相似性进行打分,取CLIP Score大于0.2的图文对作为我们的训练集。 我们使用[IDEA-CCNL/Taiyi-CLIP-RoBERTa-102M-ViT-L-Chinese](https://huggingface.co/IDEA-CCNL/Taiyi-CLIP-RoBERTa-102M-ViT-L-Chinese)作为初始化的text encoder,冻住模型的其他部分,只训练text encoder,以便保留原始模型的生成能力且实现中文概念的对齐。该模型目前在0.2亿图文对上finetune了一个epoch。
30
 
31
+ We use [Noah-Wukong](https://wukong-dataset.github.io/wukong-dataset/) (100M) and [Zero](https://zero.so.com/) (23M) as our pretraining datasets. We score every image-text pair with [IDEA-CCNL/Taiyi-CLIP-RoBERTa-102M-ViT-L-Chinese](https://huggingface.co/IDEA-CCNL/Taiyi-CLIP-RoBERTa-102M-ViT-L-Chinese) and keep the pairs whose CLIP Score is greater than 0.2 as our training set. We use [IDEA-CCNL/Taiyi-CLIP-RoBERTa-102M-ViT-L-Chinese](https://huggingface.co/IDEA-CCNL/Taiyi-CLIP-RoBERTa-102M-ViT-L-Chinese) to initialize the text encoder. To keep the generative capability of Stable Diffusion while aligning Chinese concepts with the images, we train only the text encoder and freeze the other parts of the model.
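
The CLIP Score filtering step described above can be sketched as follows. This is only a minimal illustration, not the authors' data pipeline: the `ScoredPair` record, the `filter_by_clip_score` helper, and the sample scores are hypothetical, and the scores are assumed to have been computed beforehand with the Taiyi-CLIP model.

```python
from dataclasses import dataclass
from typing import Iterable, List

# Threshold stated in the README: keep pairs with CLIP Score greater than 0.2.
CLIP_SCORE_THRESHOLD = 0.2


@dataclass(frozen=True)
class ScoredPair:
    """Hypothetical record: one image-text pair with a precomputed CLIP Score."""
    image_path: str
    caption: str
    clip_score: float


def filter_by_clip_score(
    pairs: Iterable[ScoredPair],
    threshold: float = CLIP_SCORE_THRESHOLD,
) -> List[ScoredPair]:
    """Keep only the pairs whose score is strictly greater than the threshold."""
    return [pair for pair in pairs if pair.clip_score > threshold]


# Made-up sample data, for illustration only.
sample_pairs = [
    ScoredPair("img_0001.jpg", "飞流直下三千尺", 0.31),
    ScoredPair("img_0002.jpg", "无关的文字", 0.05),
    ScoredPair("img_0003.jpg", "瀑布,油画", 0.20),  # exactly 0.2 is dropped
    ScoredPair("img_0004.jpg", "山水,素描", 0.27),
]

training_pairs = filter_by_clip_score(sample_pairs)
kept_paths = [pair.image_path for pair in training_pairs]
```

The comparison is strict because the text says "greater than 0.2"; whether the original pipeline treated the boundary this way is an assumption.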
32
 
33
+ ### Result
34
+ 飞流直下三千尺,素描。
35
+ ![](result_examples/feiliu1.png)
36
 
37
+ 飞流直下三千尺,摄影。
38
+ ![](result_examples/feiliu2.png)
39
 
40
+ 飞流直下三千尺,油画。
41
+ ![](result_examples/feiliu3.png)
 
42
 
43
 
44
+ ## 使用 Usage
45
+ ```
46
+ from diffusers import StableDiffusionPipeline
47
+ from transformers import CLIPTextModel
48
+ from torch import autocast
49
+ import os
50
+ os.environ['CUDA_VISIBLE_DEVICES'] = '1'  # selects the second GPU; change to '0' or remove on a single-GPU machine
51
+
52
+ pipe = StableDiffusionPipeline.from_pretrained("IDEA-CCNL/Taiyi-Stable-Diffusion-1B-Chinese-v0.1").to("cuda")
53
+
54
+ prompt = '飞流直下三千尺,油画'
55
+ with autocast("cuda"):
56
+     image = pipe(prompt, guidance_scale=7.5).images[0]
57
+
58
+ image.save("飞流.png")
59
+ ```
60
 
61
  ## 引用 Citation
62
 
result_examples/feiliu1.png ADDED
result_examples/feiliu2.png ADDED
result_examples/feiliu3.png ADDED