---
license: cc-by-nc-4.0
datasets:
- oeg/CelebA_RoBERTa_Sp
language:
- es
tags:
- Spanish
- CelebA
- Roberta-base-bne
- celebFaces Attributes
pipeline_tag: text-to-image
---

# RoBERTa base BNE trained with data from the descriptive text corpus of the CelebA dataset

## Overview

- **Language**: Spanish
- **Data**: [CelebA_RoBERTa_Sp](https://huggingface.co/datasets/oeg/CelebA_RoBERTa_Sp)
- **Architecture**: roberta-base
- **Paper**: [Information Processing and Management](https://doi.org/10.1016/j.ipm.2024.103667)

## Description

To improve the performance of the [RoBERTa-large-bne](https://huggingface.co/PlanTL-GOB-ES/roberta-large-bne) encoder, this model was trained on the generated corpus ([in this repository](https://huggingface.co/oeg/RoBERTa-CelebA-Sp/)) using a Siamese network with a cosine-similarity loss function. The following steps were taken:

- Use the [sentence-transformers](https://www.sbert.net/) and _torch_ libraries to implement the encoder.
- Split the training corpus into two parts: 249,000 sentences for training and 1,000 sentences for validation.
- Load the training/validation data for the model. Two lists are generated to store the information; each entry in them consists of a pair of descriptive sentences and their similarity value.
- Use [RoBERTa-large-bne](https://huggingface.co/PlanTL-GOB-ES/roberta-large-bne) as the baseline model for transformer training.
- Train with a Siamese network in which, for a pair of sentences _A_ and _B_ from the training corpus, the similarity of their embedding vectors _u_ and _v_ is computed with the cosine-similarity metric (_CosineSimilarityLoss()_) and compared with the real similarity value obtained from the training corpus.
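The objective in the last step can be illustrated with plain NumPy: the loss is the squared error between the cosine similarity of the two embeddings and the gold similarity label. This is a minimal sketch with made-up vectors; the actual training uses sentence-transformers' _CosineSimilarityLoss()_ over the full corpus.

```python
import numpy as np

def cosine_similarity(u, v):
    # Cosine similarity between two embedding vectors.
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def cosine_similarity_loss(u, v, gold_similarity):
    # Squared error between predicted and gold similarity,
    # mirroring the MSE form of CosineSimilarityLoss.
    return (cosine_similarity(u, v) - gold_similarity) ** 2

# Toy example: identical vectors with a gold label of 1.0 give zero loss.
u = np.array([1.0, 0.0])
v = np.array([1.0, 0.0])
print(cosine_similarity_loss(u, v, 1.0))  # -> 0.0
```

During training, this loss is backpropagated through both branches of the Siamese network, which share the encoder's weights.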
Model performance during training was measured with Spearman's correlation coefficient between the real similarity vector and the predicted similarity vector. The total training time using the _sentence-transformers_ library in Python was 42 days, using all available GPUs of the server with exclusive dedication.

Spearman's correlation on 1,000 test sentences was compared between the base model and our trained model. As the following table shows, our model obtains better results (correlation closer to 1).

| Models | Spearman's correlation |
| :---: | :---: |
| RoBERTa-base-bne | 0.827176427 |
| RoBERTa-celebA-Sp | 0.999913276 |

## How to use

Downloading the model produces a directory called **roberta-large-bne-celebAEs-UNI** that contains its main files. To use the model, run the following Python code:

```python
from sentence_transformers import SentenceTransformer

# Load the trained model from the downloaded directory.
model_sbert = SentenceTransformer('roberta-large-bne-celebAEs-UNI')

captions = ['La mujer tiene pomulos altos. Su cabello es de color negro. Tiene las cejas arqueadas y la boca ligeramente abierta. La joven y atractiva mujer sonriente tiene mucho maquillaje. Lleva aretes, collar y lapiz labial.']

vector = model_sbert.encode(captions)
print(vector)
```

## Results

As a result, the encoder generates one numeric vector of dimension 1024 per caption.

```python
>>$ print(vector[0])
>>$ [0.2,0.5,0.45,........0.9]
>>$ len(vector[0])
>>$ 1024
```

## More information

For more detailed information about the implementation, visit the [following link](https://github.com/eduar03yauri/DCGAN-text2face-forSpanish/blob/main/Data/encoder-models/RoBERTa_model_trained.md).
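As a complement, the Spearman's correlation used in the evaluation above can be computed without external libraries. This is a minimal sketch with made-up similarity values (not the real 1,000-pair evaluation data), using the rank-difference formula, which is valid when there are no ties.

```python
def ranks(values):
    # 1-based rank of each value; assumes no ties for simplicity.
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        r[i] = rank
    return r

def spearman(x, y):
    # Spearman's rho: 1 - 6 * sum(d_i^2) / (n * (n^2 - 1)),
    # where d_i is the difference between the ranks of x_i and y_i.
    n = len(x)
    d2 = sum((rx - ry) ** 2 for rx, ry in zip(ranks(x), ranks(y)))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

gold = [0.9, 0.1, 0.5, 0.7, 0.3]            # similarities from the corpus
predicted = [0.85, 0.15, 0.48, 0.72, 0.25]  # similarities from the encoder
print(spearman(gold, predicted))  # identical rankings -> 1.0
```

A correlation close to 1 means the encoder orders sentence pairs by similarity almost exactly as the corpus does, which is what the table above reports for the trained model.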
## Licensing information

This model is available under the [CC BY-NC 4.0](https://creativecommons.org/licenses/by-nc/4.0/deed.es) license.

## Citation information

If you use the RoBERTa+CelebA model in your work, please cite the paper published in **[Information Processing and Management](https://doi.org/10.1016/j.ipm.2024.103667)**:

```bib
@article{YAURILOZANO2024103667,
title = {Generative Adversarial Networks for text-to-face synthesis & generation: A quantitative–qualitative analysis of Natural Language Processing encoders for Spanish},
journal = {Information Processing & Management},
volume = {61},
number = {3},
pages = {103667},
year = {2024},
issn = {0306-4573},
doi = {https://doi.org/10.1016/j.ipm.2024.103667},
url = {https://www.sciencedirect.com/science/article/pii/S030645732400027X},
author = {Eduardo Yauri-Lozano and Manuel Castillo-Cara and Luis Orozco-Barbosa and Raúl García-Castro}
}
```

## Authors

- [Eduardo Yauri Lozano](https://github.com/eduar03yauri)
- [Manuel Castillo-Cara](https://github.com/manwestc)
- [Raúl García-Castro](https://github.com/rgcmme)

[*Universidad Nacional de Ingeniería*](https://www.uni.edu.pe/), [*Ontology Engineering Group*](https://oeg.fi.upm.es/), [*Universidad Politécnica de Madrid*](https://www.upm.es/internacional)

## Contributors

See the full list of contributors and more resources [here](https://github.com/eduar03yauri/DCGAN-text2face-forSpanish).