---
language:
- es
tags:
- generated_from_trainer
- recipe-generation
widget:
- text: "<RECIPE_START> <INPUT_START> salmón <NEXT_INPUT> zumo de naranja <NEXT_INPUT> aceite de oliva <NEXT_INPUT> sal <NEXT_INPUT> pimienta <INPUT_END> <INGR_START>"
- text: "<RECIPE_START> <INPUT_START> harina <NEXT_INPUT> azúcar <NEXT_INPUT> huevos <NEXT_INPUT> chocolate <NEXT_INPUT> levadura Royal <INPUT_END> <INGR_START>"
inference:
  parameters:
    top_k: 50
    top_p: 0.92
    do_sample: True
    num_return_sequences: 3
    max_new_tokens: 100
---
# Model description
This model is a fine-tuned version of [flax-community/gpt-2-spanish](https://huggingface.co/flax-community/gpt-2-spanish) on a custom dataset (not publicly available). The dataset consists of data crawled from three Spanish cooking websites and contains approximately 50,000 recipes.
It achieves the following results on the evaluation set:
- Loss: 0.5796
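If this loss is the mean per-token cross-entropy (a plausible but unconfirmed assumption for a causal language model evaluation), it corresponds to a perplexity of roughly exp(0.5796) ≈ 1.79:
```python
import math

# Perplexity = exp(cross-entropy loss); this assumes the reported evaluation
# loss is the mean per-token cross-entropy, which the card does not state explicitly.
print(math.exp(0.5796))  # ~1.79
```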
## Contributors
- Julián Cendrero ([jucendrero](https://huggingface.co/jucendrero))
- Silvia Duque ([silBERTa](https://huggingface.co/silBERTa))
## How to use it
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the fine-tuned tokenizer and model from the Hugging Face Hub.
model_checkpoint = 'gastronomia-para-to2/gastronomia_para_to2'
tokenizer = AutoTokenizer.from_pretrained(model_checkpoint)
model = AutoModelForCausalLM.from_pretrained(model_checkpoint)
```
The tokenizer makes use of the following special tokens to indicate the structure of the recipe:
```python
special_tokens = [
    '<INPUT_START>',
    '<NEXT_INPUT>',
    '<INPUT_END>',
    '<TITLE_START>',
    '<TITLE_END>',
    '<INGR_START>',
    '<NEXT_INGR>',
    '<INGR_END>',
    '<INSTR_START>',
    '<NEXT_INSTR>',
    '<INSTR_END>',
    '<RECIPE_START>',
    '<RECIPE_END>']
```
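As an optional sanity check (not part of the original card), you can verify that the checkpoint's tokenizer treats each of these markers as a single token:
```python
# Each structural marker should map to exactly one id if the checkpoint
# registers it as an additional special token; otherwise it would be split
# into several sub-word pieces.
for token in special_tokens:
    ids = tokenizer.encode(token, add_special_tokens=False)
    print(token, ids)
```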
The input should be of the form:
```python
<RECIPE_START> <INPUT_START> ingredient_1 <NEXT_INPUT> ingredient_2 <NEXT_INPUT> ... <NEXT_INPUT> ingredient_n <INPUT_END> <INGR_START>
```
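For convenience, such a prompt can be assembled from a plain list of ingredients. The helper below is a sketch; the `build_prompt` name is ours and not part of the model's API:
```python
def build_prompt(ingredients):
    # Join the ingredient names with the <NEXT_INPUT> separator and wrap
    # them in the structural tokens the model expects.
    joined = ' <NEXT_INPUT> '.join(ingredients)
    return f'<RECIPE_START> <INPUT_START> {joined} <INPUT_END> <INGR_START>'

# Ingredients taken from the widget example above; the result feeds the
# generation snippet below.
input = build_prompt(['salmón', 'zumo de naranja', 'aceite de oliva', 'sal', 'pimienta'])
```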
We are using the following configuration to generate recipes, but feel free to change parameters as needed:
```python
# `input` is a prompt string in the format shown above.
tokenized_input = tokenizer(input, return_tensors='pt')
output = model.generate(**tokenized_input,
                        max_length=600,
                        do_sample=True,
                        top_p=0.92,
                        top_k=50,
                        num_return_sequences=3)
pre_output = tokenizer.decode(output[0], skip_special_tokens=False)
```
The generated recipe ends at the first occurrence of the `<RECIPE_END>` special token.
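A minimal post-processing sketch (assuming `pre_output` was decoded as above) to keep only the first complete recipe:
```python
# Cut the decoded text at the first <RECIPE_END> marker, if present.
end_marker = '<RECIPE_END>'
recipe = pre_output.split(end_marker)[0]
print(recipe)
```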
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training (an illustrative configuration sketch follows the list):
- learning_rate: 2e-05
- train_batch_size: 1
- eval_batch_size: 1
- seed: 42
- gradient_accumulation_steps: 8
- total_train_batch_size: 8
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 6
- mixed_precision_training: Native AMP
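The exact training script is not included in this card; as an illustration only, a `transformers.TrainingArguments` configuration mirroring the values above might look like this:
```python
from transformers import TrainingArguments

# Hypothetical reconstruction of the listed hyperparameters; Adam betas and
# epsilon are left at their defaults (0.9, 0.999, 1e-08), matching the card.
training_args = TrainingArguments(
    output_dir='gastronomia_para_to2',  # placeholder output directory
    learning_rate=2e-05,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    seed=42,
    gradient_accumulation_steps=8,      # effective batch size: 1 x 8 = 8
    lr_scheduler_type='linear',
    num_train_epochs=6,
    fp16=True,                          # native AMP mixed precision
)
```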
### Training results
| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:-----:|:---------------:|
| 0.6213 | 1.0 | 5897 | 0.6214 |
| 0.5905 | 2.0 | 11794 | 0.5995 |
| 0.5777 | 3.0 | 17691 | 0.5893 |
| 0.574 | 4.0 | 23588 | 0.5837 |
| 0.5553 | 5.0 | 29485 | 0.5807 |
| 0.5647 | 6.0 | 35382 | 0.5796 |
### Framework versions
- Transformers 4.17.0
- Pytorch 1.11.0+cu102
- Datasets 2.0.0
- Tokenizers 0.11.6
## References
The list of special tokens used to mark the recipe structure has been taken from:
[RecipeNLG: A Cooking Recipes Dataset for Semi-Structured Text Generation](https://www.aclweb.org/anthology/2020.inlg-1.4.pdf).