Commit df5ae33 by jucendrero (parent: 99efe8f): Updated model card

README.md:

---
language:
- es
tags:
- generated_from_trainer
- recipe-generation
model-index:
- name: gastronomia_para_to2
  results: []
---

# Model description

This model is a fine-tuned version of [flax-community/gpt-2-spanish](https://huggingface.co/flax-community/gpt-2-spanish) on a custom dataset (not publicly available). The dataset consists of data crawled from three Spanish cooking websites and contains approximately 50,000 recipes.

It achieves the following results on the evaluation set:
- Loss: 0.5796

## Contributors

- Julián Cendrero ([jucendrero](https://huggingface.co/jucendrero))
- Silvia Duque ([silBERTa](https://huggingface.co/silBERTa))

## How to use it

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_checkpoint = 'gastronomia-para-to2/gastronomia_para_to2'
tokenizer = AutoTokenizer.from_pretrained(model_checkpoint)
model = AutoModelForCausalLM.from_pretrained(model_checkpoint)
```

The tokenizer makes use of the following special tokens to indicate the structure of the recipe:

```python
special_tokens = [
    '<INPUT_START>',
    '<NEXT_INPUT>',
    '<INPUT_END>',
    '<TITLE_START>',
    '<TITLE_END>',
    '<INGR_START>',
    '<NEXT_INGR>',
    '<INGR_END>',
    '<INSTR_START>',
    '<NEXT_INSTR>',
    '<INSTR_END>',
    '<RECIPE_START>',
    '<RECIPE_END>']
```
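
A quick sanity check, assuming the checkpoint ships these tokens in its vocabulary, is to confirm that each one maps to a single token id (this snippet is ours, not part of the original card):

```python
# Each special token should map to one id in the tokenizer's vocabulary;
# the actual ids depend on the checkpoint, so this is purely illustrative.
for token in special_tokens:
    print(token, tokenizer.convert_tokens_to_ids(token))
```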

The input should be of the form:

```
<RECIPE_START> <INPUT_START> ingredient_1 <NEXT_INPUT> ingredient_2 <NEXT_INPUT> ... <NEXT_INPUT> ingredient_n <INPUT_END> <INGR_START>
```
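
For example, a minimal sketch of assembling that input from a Python list (the ingredient names below are purely illustrative):

```python
# Build the prompt from a list of ingredients (names are illustrative).
ingredients = ['pollo', 'ajo', 'aceite de oliva']
input = ('<RECIPE_START> <INPUT_START> '
         + ' <NEXT_INPUT> '.join(ingredients)
         + ' <INPUT_END> <INGR_START>')
```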

We are using the following configuration to generate recipes, but feel free to change parameters as needed:

```python
tokenized_input = tokenizer(input, return_tensors='pt')
output = model.generate(**tokenized_input,
                        max_length=600,
                        do_sample=True,
                        top_p=0.92,
                        top_k=50,
                        num_return_sequences=3)
pre_output = tokenizer.decode(output[0], skip_special_tokens=False)
```

The recipe ends where the `<RECIPE_END>` special token appears for the first time.
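
A minimal post-processing sketch along those lines (our own helper, assuming `<RECIPE_END>` was actually generated within `max_length`):

```python
import re

# Truncate at the first <RECIPE_END>; anything after it is discarded.
recipe = pre_output.split('<RECIPE_END>')[0]

# Optionally pull out one section, e.g. the title (illustrative):
match = re.search(r'<TITLE_START>(.*?)<TITLE_END>', recipe, re.DOTALL)
title = match.group(1).strip() if match else None
```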

## Training procedure

The following hyperparameters were used during training:

[…]

- Pytorch 1.11.0+cu102
- Datasets 2.0.0
- Tokenizers 0.11.6

## References

The list of special tokens used to mark the recipe structure has been taken from:
[RecipeNLG: A Cooking Recipes Dataset for Semi-Structured Text Generation](https://www.aclweb.org/anthology/2020.inlg-1.4.pdf).