---
license: apache-2.0
library_name: diffusers
---
# FaceScore

<p align="center">
   📃 <a href="https://arxiv.org/abs/2406.17100" target="_blank">Paper</a> • 🤗 <a href="https://huggingface.co/OPPOer/FaceScore" target="_blank">Checkpoints</a>
</p>

**FaceScore: Benchmarking and Enhancing Face Quality in Human Generation**

Traditional facial quality assessment focuses on whether a face is suitable for recognition, while image aesthetic scorers emphasize overall aesthetics rather than details. FaceScore is the first reward model that focuses on faces in text-to-image models, designed to score the faces generated in images. It is fine-tuned on positive and negative sample pairs generated using an inpainting pipeline based on real face images and surpasses previous models in predicting human preferences for generated faces.

- [Install Dependency](#install-dependency)
- [Example Use](#example-use)
- [LoRA based on SDXL](#lora-based-on-sdxl)
- [Acknowledgement](#acknowledgement)
- [Citation](#citation)

## Install Dependency

This codebase relies heavily on [ImageReward](https://github.com/THUDM/ImageReward).
Please follow its installation instructions first.
In addition, we use two additional packages.
You can install them as follows:
```
pip install batch-face image-reward
```

## Example Use

We provide an example inference script in this repository.
We also provide a real face image for testing. Note that the model can also score real faces in an image, and no prompt is required.


Use the following code to get the face score from FaceScore:

```python
from FaceScore.FaceScore import FaceScore

# Load the FaceScore checkpoint by name
face_score_model = FaceScore('FaceScore')
# Or load it locally from a downloaded checkpoint and config:
# face_score_model = FaceScore(path_to_checkpoint, med_config=path_to_config)

# Score the bundled test image; no prompt is needed
img_path = 'assets/Lecun.jpg'
face_score, box, confidences = face_score_model.get_reward(img_path)
print(f'The face score of {img_path} is {face_score}, and the bounding box of the face(s) is {box}')
You can also load the model locally after downloading the checkpoint from [FaceScore](https://huggingface.co/OPPOer/FaceScore/tree/main).

The output should look like the following (the exact numbers may differ slightly depending on the compute device):

```
The face score of assets/Lecun.jpg is 3.993915319442749, and the bounding box of the face(s) is [[104.02845764160156, 28.232379913330078, 143.57421875, 78.53730773925781]]
```
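
To compare several images, the per-image call above can be wrapped in a small ranking helper. The sketch below is illustrative only: `rank_by_face_score` and its `score_fn` parameter are hypothetical names that are not part of this repository, and it assumes the scorer returns a single number per image. A stand-in scorer with made-up values is used so the snippet runs without the model.

```python
from typing import Callable, Iterable, List, Tuple


def rank_by_face_score(
    image_paths: Iterable[str],
    score_fn: Callable[[str], float],
) -> List[Tuple[str, float]]:
    """Return (path, score) pairs sorted from highest to lowest score.

    `score_fn` is a hypothetical hook: any callable that maps an image
    path to one numeric score.
    """
    scored = [(path, float(score_fn(path))) for path in image_paths]
    return sorted(scored, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Stand-in scorer with made-up values (not real FaceScore outputs).
    fake_scores = {"a.png": 1.5, "b.png": 3.2, "c.png": -0.4}
    for path, score in rank_by_face_score(fake_scores, fake_scores.get):
        print(f"{path}: {score:.2f}")
```

With the real model, one could pass `score_fn=lambda p: face_score_model.get_reward(p)[0]`, assuming the first returned value is a single scalar for the image; images with several faces may need extra handling.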


## LoRA based on SDXL
We use FaceScore to filter data and perform direct preference optimization on SDXL.
The LoRA weights are available [here](https://huggingface.co/OPPOer/FaceScore/tree/main).
Here is a quick example:
```python
import torch
from diffusers import AutoencoderKL, StableDiffusionXLPipeline

# Load the SDXL base pipeline in half precision
inference_dtype = torch.float16
pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=inference_dtype,
)
# Swap in the fp16-safe SDXL VAE
vae = AutoencoderKL.from_pretrained(
    'madebyollin/sdxl-vae-fp16-fix',
    torch_dtype=inference_dtype,
)
pipe.vae = vae
# Load the LoRA weights (you can also load them from a local path)
pipe.load_lora_weights("OPPOer/FaceScore/FaceLoRA")
pipe.to('cuda')

# Fix the seed so the result is reproducible
generator = torch.Generator(device='cuda').manual_seed(42)
image = pipe(
    prompt='A woman in a costume standing in the desert',
    guidance_scale=5.0,
    generator=generator,
    output_type='pil',
).images[0]
image.save('A woman in a costume standing in the desert.png')
```
Below are some examples generated by our model (right) compared with the original SDXL (left).
<div style="display: flex; justify-content: space-around; text-align: center;">
    <div style="text-align: center;">
        <img src="assets/desert.jpg" alt="Image 1" style="width: 600px;" />
        <p>A woman in a costume standing in the desert. </p>
    </div>
    <div style="text-align: center;">
        <img src="assets/scarf.jpg" alt="Image 2" style="width: 600px;" />
        <p>A woman wearing a blue jacket and scarf.</p>
    </div>
</div>
<div style="display: flex; justify-content: space-around; text-align: center;">
    <div style="text-align: center;">
        <img src="assets/stage.jpg" alt="Image 1" style="width: 600px;" />
        <p>A young woman in a blue dress performing on stage. </p>
    </div>
    <div style="text-align: center;">
        <img src="assets/striped.jpg" alt="Image 2" style="width: 600px;" />
        <p>A woman with black hair and a striped shirt.</p>
    </div>
</div>
<div style="display: flex; justify-content: space-around; text-align: center;">
    <div style="text-align: center;">
        <img src="assets/sword.jpg" alt="Image 1" style="width: 600px;" />
        <p>A woman with white hair and white armor is holding a sword. </p>
    </div>
    <div style="text-align: center;">
        <img src="assets/white.jpg" alt="Image 2" style="width: 600px;" />
        <p>A woman with long black hair and a white shirt.</p>
    </div>
</div>
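
The data-filtering step mentioned at the start of this section can be pictured roughly as follows. This is only a sketch under stated assumptions, not the procedure used to train the released LoRA: the function `build_preference_pairs`, the `min_margin` threshold, and the best-versus-worst pairing rule are all hypothetical choices made for illustration. It takes already-computed scores per prompt and keeps a (preferred, rejected) pair only when the score gap is large enough.

```python
from typing import Dict, List, Tuple


def build_preference_pairs(
    scored: Dict[str, List[Tuple[str, float]]],
    min_margin: float = 1.0,
) -> List[Tuple[str, str, str]]:
    """Return (prompt, preferred_image, rejected_image) triples.

    For each prompt, the highest- and lowest-scoring images form a pair,
    which is kept only if their score gap is at least `min_margin`.
    Both the pairing rule and the threshold are illustrative assumptions.
    """
    pairs = []
    for prompt, candidates in scored.items():
        if len(candidates) < 2:
            continue  # a pair needs at least two images
        best = max(candidates, key=lambda c: c[1])
        worst = min(candidates, key=lambda c: c[1])
        if best[1] - worst[1] >= min_margin:
            pairs.append((prompt, best[0], worst[0]))
    return pairs
```

Pairs built this way could then be fed to a preference-optimization trainer; a larger `min_margin` trades dataset size for cleaner preference signals.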

## Acknowledgement
Our codebase builds on the code from [ImageReward](https://github.com/THUDM/ImageReward). We thank the authors for open-sourcing their code.

## Citation

```
@article{liao2024facescore,
  title={FaceScore: Benchmarking and Enhancing Face Quality in Human Generation},
  author={Liao, Zhenyi and Xie, Qingsong and Chen, Chen and Lu, Hannan and Deng, Zhijie},
  journal={arXiv preprint arXiv:2406.17100},
  year={2024}
}
```