---
base_model: yhavinga/ul2-large-dutch
library_name: peft
license: apache-2.0
tags:
- generated_from_trainer
model-index:
- name: ul2-large-dutch-finetuned-oba-book-search
  results: []
---

# ul2-large-dutch-finetuned-oba-book-search

This model is a PEFT fine-tuned version of [yhavinga/ul2-large-dutch](https://huggingface.co/yhavinga/ul2-large-dutch) for an OBA book-search task; the training dataset is not documented in this card.
It achieves the following results on the evaluation set:
- Loss: 4.1161
- Top-5-accuracy: 4.1679

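Since this repository contains a PEFT adapter rather than full model weights, it has to be loaded on top of the base checkpoint. Below is a minimal inference sketch; the adapter repository id and the example query are placeholders, and the prompt format used during fine-tuning is not documented here.

```python
from peft import PeftModel
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

base_id = "yhavinga/ul2-large-dutch"
# Placeholder: replace with the actual adapter repository id.
adapter_id = "your-username/ul2-large-dutch-finetuned-oba-book-search"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForSeq2SeqLM.from_pretrained(base_id)
model = PeftModel.from_pretrained(base_model, adapter_id)

# Example query (illustrative); the input format expected by the adapter may differ.
inputs = tokenizer("zoek een boek over de Tweede Wereldoorlog", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
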
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.3
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 10

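The exact `Trainer` setup is not included in this card. The sketch below maps the values above onto `Seq2SeqTrainingArguments`; the output directory and the 200-step evaluation interval (visible in the results table below) are assumptions.

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="ul2-large-dutch-finetuned-oba-book-search",  # assumed
    learning_rate=0.3,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=10,
    eval_strategy="steps",  # assumed: the table reports validation metrics every 200 steps
    eval_steps=200,
)
```
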
### Training results

| Training Loss | Epoch | Step | Validation Loss | Top-5-accuracy |
|:-------------:|:------:|:----:|:---------------:|:--------------:|
| 6.2541 | 0.2577 | 200 | 4.6137 | 0.0579 |
| 5.8635 | 0.5155 | 400 | 4.5076 | 0.1158 |
| 5.5301 | 0.7732 | 600 | 4.4350 | 0.1447 |
| 5.5298 | 1.0309 | 800 | 4.4449 | 0.1447 |
| 5.3296 | 1.2887 | 1000 | 4.4621 | 0.1158 |
| 5.3336 | 1.5464 | 1200 | 4.4232 | 0.1447 |
| 5.2192 | 1.8041 | 1400 | 4.3842 | 0.1447 |
| 5.2348 | 2.0619 | 1600 | 4.3465 | 0.1447 |
| 5.0988 | 2.3196 | 1800 | 4.3129 | 0.2026 |
| 5.1633 | 2.5773 | 2000 | 4.3007 | 0.1737 |
| 5.1103 | 2.8351 | 2200 | 4.2722 | 0.2026 |
| 5.0057 | 3.0928 | 2400 | 4.3158 | 0.1447 |
| 5.0554 | 3.3505 | 2600 | 4.2731 | 0.4342 |
| 4.9774 | 3.6082 | 2800 | 4.2467 | 0.3763 |
| 4.9769 | 3.8660 | 3000 | 4.2320 | 0.5789 |
| 4.9825 | 4.1237 | 3200 | 4.2115 | 0.8394 |
| 4.9692 | 4.3814 | 3400 | 4.2172 | 1.3893 |
| 4.9681 | 4.6392 | 3600 | 4.2093 | 1.5630 |
| 4.8661 | 4.8969 | 3800 | 4.2003 | 2.2865 |
| 4.942 | 5.1546 | 4000 | 4.2047 | 2.3734 |
| 4.8974 | 5.4124 | 4200 | 4.1583 | 2.8654 |
| 4.8827 | 5.6701 | 4400 | 4.1852 | 2.9522 |
| 4.8705 | 5.9278 | 4600 | 4.1661 | 3.4732 |
| 4.8714 | 6.1856 | 4800 | 4.1478 | 3.7916 |
| 4.7909 | 6.4433 | 5000 | 4.1748 | 3.6179 |
| 4.8357 | 6.7010 | 5200 | 4.1471 | 3.9074 |
| 4.8723 | 6.9588 | 5400 | 4.1518 | 4.0232 |
| 4.8838 | 7.2165 | 5600 | 4.1428 | 4.1389 |
| 4.804 | 7.4742 | 5800 | 4.1468 | 4.0232 |
| 4.8232 | 7.7320 | 6000 | 4.1390 | 4.1389 |
| 4.8571 | 7.9897 | 6200 | 4.1305 | 4.0810 |
| 4.7454 | 8.2474 | 6400 | 4.1297 | 4.1679 |
| 4.8652 | 8.5052 | 6600 | 4.1262 | 4.1968 |
| 4.7882 | 8.7629 | 6800 | 4.1227 | 4.1679 |
| 4.8025 | 9.0206 | 7000 | 4.1134 | 4.1679 |
| 4.8124 | 9.2784 | 7200 | 4.1211 | 4.1389 |
| 4.7157 | 9.5361 | 7400 | 4.1122 | 4.1389 |
| 4.8666 | 9.7938 | 7600 | 4.1161 | 4.1679 |

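The card does not state how Top-5-accuracy was computed. One plausible reading for a generative seq2seq model is the percentage of evaluation queries whose reference target appears among five beam-search candidates; the helper below is an illustrative sketch of that reading, not the metric code actually used.

```python
def top5_accuracy(model, tokenizer, queries, targets):
    """Illustrative only: percentage of queries whose reference target
    appears among 5 beam-search candidates generated by the model."""
    hits = 0
    for query, target in zip(queries, targets):
        inputs = tokenizer(query, return_tensors="pt")
        outputs = model.generate(
            **inputs,
            num_beams=5,
            num_return_sequences=5,
            max_new_tokens=32,
        )
        candidates = tokenizer.batch_decode(outputs, skip_special_tokens=True)
        hits += int(target in candidates)
    return 100.0 * hits / len(queries)
```
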
### Framework versions

- PEFT 0.11.0
- Transformers 4.44.2
- Pytorch 1.13.0+cu116
- Datasets 3.0.0
- Tokenizers 0.19.1