- celebFaces Attributes
---

# RoBERTa base BNE trained with data from the descriptive text corpus of the CelebA dataset

## Overview

- **Language**: Spanish
- **Data**: [CelebA_RoBERTa_Sp](https://huggingface.co/datasets/oeg/CelebA_RoBERTa_Sp)
- **Architecture**: roberta-base

## Description

To improve the performance of the RoBERTa encoder, this model was trained on the generated corpus (available [in this repository](https://huggingface.co/oeg/RoBERTa-CelebA-Sp/)) following a Siamese-network strategy with a cosine-similarity loss function. The training procedure was as follows:

- Import the sentence-transformers and torch libraries to implement the encoder.
- Split the training corpus into two parts: 249,999 sentences for training and 10,000 sentences for validation.
- Load the training and validation data for the model. Two lists are generated to store the information; each entry consists of a pair of descriptive sentences and their similarity value.
- Use RoBERTa as the baseline model for transformer training.
- Train with a Siamese network in which, for each pair of sentences _A_ and _B_ from the training corpus, the similarity of their embedding vectors _u_ and _v_ is evaluated with the cosine similarity metric (_CosineSimilarityLoss()_).

## How to use

## Licensing information

This model is available under the [Apache License 2.0](https://www.apache.org/licenses/LICENSE-2.0).

## Citation information

If you use the Roberta_CelebA_Sp model in your work, please cite [this repository](https://huggingface.co/oeg/RoBERTa-CelebA-Sp/):

## Authors

- [Eduardo Yauri Lozano](https://github.com/eduar03yauri)
- [Manuel Castillo-Cara](https://github.com/manwestc)
- [Raúl García-Castro](https://github.com/rgcmme)

[*Universidad Nacional de Ingeniería*](https://www.uni.edu.pe/), [*Ontology Engineering Group*](https://oeg.fi.upm.es/), [*Universidad Politécnica de Madrid*](https://www.upm.es/internacional)

## Contributors

See the full list of contributors [here](https://github.com/eduar03yauri/DCGAN-text2face-forSpanishs).

<kbd><img src="https://raw.githubusercontent.com/oeg-upm/TINTO/main/assets/logo-oeg.png" alt="Ontology Engineering Group" width="100"></kbd>
<kbd><img src="https://raw.githubusercontent.com/oeg-upm/TINTO/main/assets/logo-upm.png" alt="Universidad Politécnica de Madrid" width="100"></kbd>
<kbd><img src="https://www.uni.edu.pe/images/logos/logo_uni_2016.png" alt="Universidad Nacional de Ingeniería" width="200"></kbd>