|
--- |
|
license: cc-by-nc-4.0 |
|
datasets: |
|
- oeg/CelebA_RoBERTa_Sp |
|
language: |
|
- es |
|
tags: |
|
- Spanish |
|
- CelebA |
|
- Roberta-base-bne |
|
- CelebFaces Attributes
|
pipeline_tag: text-to-image |
|
--- |
|
# RoBERTa base BNE trained with data from the descriptive text corpus of the CelebA dataset |
|
|
|
## Overview |
|
|
|
- **Language**: Spanish |
|
- **Data**: [CelebA_RoBERTa_Sp](https://huggingface.co/datasets/oeg/CelebA_RoBERTa_Sp)
|
- **Architecture**: roberta-base |
|
- **Paper**: [Information Processing and Management](https://doi.org/10.1016/j.ipm.2024.103667)
|
|
|
## Description |
|
To improve the performance of the [RoBERTa-large-bne](https://huggingface.co/PlanTL-GOB-ES/roberta-large-bne) encoder, this model was trained on the generated corpus ([in this repository](https://huggingface.co/oeg/RoBERTa-CelebA-Sp/)), using a Siamese network with a cosine-similarity loss function. The training procedure was as follows:
|
- Use the [sentence-transformers](https://www.sbert.net/) and _torch_ libraries to implement the encoder.
|
- Split the corpus into two parts: 249,000 sentences for training and 1,000 sentences for validation.
|
- Load the training/validation data into two lists, one per split, in which each entry consists of a pair of descriptive sentences and their similarity value.
|
- Use [RoBERTa-large-bne](https://huggingface.co/PlanTL-GOB-ES/roberta-large-bne) as the base model for training.
|
- Train with a Siamese network in which, for each pair of sentences _A_ and _B_ from the training corpus, the cosine similarity between their embedding vectors _u_ and _v_ is computed with _CosineSimilarityLoss()_ and compared with the reference similarity value from the corpus. Model performance during training was measured with Spearman's correlation coefficient between the reference similarity vector and the predicted similarity vector (see the sketch after this list).
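
The procedure above can be sketched with the _sentence-transformers_ API. The snippet below is a minimal illustration rather than the original training script: the inline data, batch size, number of epochs and warmup steps are assumptions.

```python
from torch.utils.data import DataLoader
from sentence_transformers import (SentenceTransformer, InputExample,
                                   models, losses, evaluation)

# Build the encoder from the RoBERTa-large-bne baseline with mean pooling
word_model = models.Transformer('PlanTL-GOB-ES/roberta-large-bne')
pooling = models.Pooling(word_model.get_word_embedding_dimension())
model = SentenceTransformer(modules=[word_model, pooling])

# Illustrative pairs; the real corpus holds 249,000 training and 1,000
# validation entries of (sentence A, sentence B, similarity value)
train_rows = [('La mujer tiene el cabello negro.', 'Su cabello es negro.', 0.9),
              ('El hombre lleva barba.', 'La mujer es rubia.', 0.1)]
val_rows = [('Tiene la boca abierta.', 'Su boca esta abierta.', 0.95),
            ('Lleva aretes y collar.', 'El hombre es calvo.', 0.05)]

train_examples = [InputExample(texts=[a, b], label=score)
                  for a, b, score in train_rows]
train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=16)

# Siamese objective: cosine similarity of the two embeddings vs. the label
train_loss = losses.CosineSimilarityLoss(model)

# Reports Spearman's correlation between predicted and reference similarities
val_a, val_b, val_scores = map(list, zip(*val_rows))
evaluator = evaluation.EmbeddingSimilarityEvaluator(val_a, val_b, val_scores)

model.fit(train_objectives=[(train_dataloader, train_loss)],
          evaluator=evaluator,
          epochs=1,           # assumed value, not the paper's setting
          warmup_steps=100,   # assumed value, not the paper's setting
          output_path='roberta-large-bne-celebAEs-UNI')
```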
|
|
|
Training with the _sentence-transformers_ library in Python took 42 days in total, using all available GPUs of the server, dedicated exclusively to this task.
|
|
|
Spearman's correlation over 1,000 test sentences was compared between the base model and our trained model.

As the following table shows, our model obtains better results (correlation closer to 1).
|
|
|
| Models | Spearman's correlation | |
|
| :---: | :---: | |
|
| RoBERTa-base-bne | 0.827176427 | |
|
| RoBERTa-celebA-Sp | 0.999913276 | |
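
A comparison of this kind can be reproduced with _scipy_; the sketch below is illustrative, with hypothetical test pairs and reference scores standing in for the 1,000-sentence test set.

```python
from scipy.stats import spearmanr
from sentence_transformers import SentenceTransformer, util

# Hypothetical test data: sentence pairs with reference similarity scores
pairs = [('La mujer sonrie.', 'Ella esta sonriendo.'),
         ('Tiene el cabello negro.', 'Su cabello es rubio.'),
         ('Lleva lapiz labial.', 'Usa maquillaje en los labios.')]
gold = [0.9, 0.2, 0.8]

def spearman_for(model_name):
    # Loading the plain base model wraps it with mean pooling automatically
    model = SentenceTransformer(model_name)
    pred = [float(util.cos_sim(model.encode(a), model.encode(b)))
            for a, b in pairs]
    return spearmanr(gold, pred).correlation

print(spearman_for('PlanTL-GOB-ES/roberta-large-bne'))  # base model
print(spearman_for('roberta-large-bne-celebAEs-UNI'))   # trained model
```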
|
|
|
## How to use |
|
Downloading the model produces a directory called **roberta-large-bne-celebAEs-UNI** containing its main files. To use the model, run the following Python code:
|
```python |
|
from sentence_transformers import SentenceTransformer

# Load the trained encoder from the downloaded directory
model_sbert = SentenceTransformer('roberta-large-bne-celebAEs-UNI')

# Example caption: "The woman has high cheekbones. Her hair is black. She has
# arched eyebrows and a slightly open mouth. The young, attractive, smiling
# woman wears heavy make-up. She wears earrings, a necklace and lipstick."
caption = ('La mujer tiene pomulos altos. Su cabello es de color negro. '
           'Tiene las cejas arqueadas y la boca ligeramente abierta. '
           'La joven y atractiva mujer sonriente tiene mucho maquillaje. '
           'Lleva aretes, collar y lapiz labial.')

vector = model_sbert.encode(caption)
print(vector)
|
``` |
|
## Results |
|
As a result, the encoder generates a numeric vector of dimension 1024.
|
|
|
```python |
|
>>> print(vector)
[0.2, 0.5, 0.45, ........, 0.9]
>>> len(vector)
1024
|
``` |
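
Since the encoder was trained with a cosine-similarity objective, the resulting vectors can be compared directly. A minimal sketch with two illustrative captions:

```python
from sentence_transformers import SentenceTransformer, util

model_sbert = SentenceTransformer('roberta-large-bne-celebAEs-UNI')

# Two illustrative captions describing similar faces
caption_a = 'La mujer tiene el cabello negro y la boca ligeramente abierta.'
caption_b = 'Su cabello es de color negro y su boca esta abierta.'

emb_a = model_sbert.encode(caption_a)
emb_b = model_sbert.encode(caption_b)

# Cosine similarity of the two 1024-dimensional embeddings, in [-1, 1]
print(util.cos_sim(emb_a, emb_b))
```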
|
|
|
## More information |
|
|
|
To see more detailed information about the implementation visit the [following link](https://github.com/eduar03yauri/DCGAN-text2face-forSpanish/blob/main/Data/encoder-models/RoBERTa_model_trained.md). |
|
|
|
## Licensing information |
|
This model is available under the [CC BY-NC 4.0](https://creativecommons.org/licenses/by-nc/4.0/deed.es) license.
|
|
|
## Citation information |
|
|
|
**Citing**: If you use the RoBERTa+CelebA model in your work, please cite the paper published in **[Information Processing and Management](https://doi.org/10.1016/j.ipm.2024.103667)**:
|
|
|
```bib |
|
@article{YAURILOZANO2024103667, |
|
title = {Generative Adversarial Networks for text-to-face synthesis \& generation: A quantitative–qualitative analysis of Natural Language Processing encoders for Spanish},
|
journal = {Information Processing \& Management},
|
volume = {61}, |
|
number = {3}, |
|
pages = {103667}, |
|
year = {2024}, |
|
issn = {0306-4573}, |
|
doi = {10.1016/j.ipm.2024.103667},
|
url = {https://www.sciencedirect.com/science/article/pii/S030645732400027X}, |
|
author = {Eduardo Yauri-Lozano and Manuel Castillo-Cara and Luis Orozco-Barbosa and Raúl García-Castro} |
|
} |
|
``` |
|
|
|
## Authors
|
- [Eduardo Yauri Lozano](https://github.com/eduar03yauri) |
|
- [Manuel Castillo-Cara](https://github.com/manwestc) |
|
- [Raúl García-Castro](https://github.com/rgcmme) |
|
|
|
[*Universidad Nacional de Ingeniería*](https://www.uni.edu.pe/), [*Ontology Engineering Group*](https://oeg.fi.upm.es/), [*Universidad Politécnica de Madrid*](https://www.upm.es/internacional).
|
|
|
## Contributors |
|
See the full list of contributors and more resources [here](https://github.com/eduar03yauri/DCGAN-text2face-forSpanish). |
|
|
|
<kbd><img src="https://www.uni.edu.pe/images/logos/logo_uni_2016.png" alt="Universidad Politécnica de Madrid" width="100"></kbd> |
|
<kbd><img src="https://raw.githubusercontent.com/oeg-upm/TINTO/main/assets/logo-oeg.png" alt="Ontology Engineering Group" width="100"></kbd> |
|
<kbd><img src="https://raw.githubusercontent.com/oeg-upm/TINTO/main/assets/logo-upm.png" alt="Universidad Politécnica de Madrid" width="100"></kbd> |