File size: 2,032 Bytes
5dbce35
 
83b09ef
 
 
5dbce35
83b09ef
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
---
license: apache-2.0
language:
- en
library_name: diffusers
---

# Model Card for Arc2Face

<div align="center">

[**Project Page**](https://arc2face.github.io/) **|** [**Paper (ArXiv)**]() **|** [**Code**](https://github.com/foivospar/Arc2Face)


</div>

## Introduction

Arc2Face is an ID-conditioned face model, that can generate diverse, ID-consistent photos of a person given only its ArcFace ID-embedding.
It is trained on a restored version of the WebFace42M face recognition database, and is further fine-tuned on FFHQ and CelebA-HQ.

<div  align="center">
<img src='assets/samples_short.jpg'>
</div>

## Model Details

It consists of 2 components:
- encoder, a finetuned CLIP ViT-L/14 model
- arc2face, a finetuned UNet model

both of which are fine-tuned from [runwayml/stable-diffusion-v1-5](https://huggingface.co/runwayml/stable-diffusion-v1-5).
The encoder is tailored for projecting ID-embeddings to the CLIP latent space.
Arc2Face adapts the pre-trained backbone to the task of ID-to-face generation, conditioned solely on ID vectors.

## Usage

The models can be downloaded directly from this repository or using python:
```python
from huggingface_hub import hf_hub_download

hf_hub_download(repo_id="FoivosPar/Arc2Face", filename="arc2face/config.json", local_dir="./models/arc2face")
hf_hub_download(repo_id="FoivosPar/Arc2Face", filename="arc2face/diffusion_pytorch_model.safetensors", local_dir="./models/arc2face")
hf_hub_download(repo_id="FoivosPar/Arc2Face", filename="encoder/config.json", local_dir="./models/encoder")
hf_hub_download(repo_id="FoivosPar/Arc2Face", filename="encoder/pytorch_model.bin", local_dir="./models/encoder")
```

Please check our [GitHub repository](https://arc2face.github.io/) for complete inference instruction.

## Limitations and Bias

- Only one person per image can be generated.
- Poses are constrained to the frontal hemisphere, similar to FFHQ images.
- The model may reflect the biases of the training data or the ID encoder.

## Citation


**BibTeX:**

```bibtex
```