Update README.md
README.md
CHANGED
@@ -28,7 +28,7 @@ This model is a fine-tuned version of TODO on [ReBatch/ultrafeedback_nl](https:/
 
 ## Model description
 
-This model is a Dutch chat model, originally developed from Mistral 7B v0.3 Instruct and further finetuned with QLoRA.
+This model is a Dutch chat model, originally developed from Mistral 7B v0.3 Instruct and further finetuned with QLoRA: first with SFT on a chat dataset, then with DPO on a chat feedback dataset.
 
 
 ## Intended uses & limitations
@@ -56,7 +56,7 @@ Mistral-7B-v0.3-Instruct | 60.76 / 45.39 | 13.20 / 34.26 | 23.23 / 59.26 | 48.94
 Finetuned by [Julien Van den Avenne](https://huggingface.co/vandeju)
 
 
-
+## Training hyperparameters
 
 The following hyperparameters were used during training:
 - learning_rate: 5e-06
@@ -70,8 +70,8 @@ The following hyperparameters were used during training:
 - lr_scheduler_type: cosine
 - lr_scheduler_warmup_ratio: 0.1
 - num_epochs: 1
-
-
+
+## Framework versions
 
 - PEFT 0.11.1
 - Transformers 4.41.2
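The card lists the optimizer schedule but not the training script itself. Below is a minimal sketch of how the DPO stage could be wired together, assuming TRL's `DPOTrainer`, 4-bit NF4 quantization for QLoRA, the `mistralai/Mistral-7B-Instruct-v0.3` base checkpoint, and a `prompt`/`chosen`/`rejected` column layout in ReBatch/ultrafeedback_nl; only the hyperparameter values come from the card, everything else is an assumption.

```python
# Hypothetical reconstruction of the DPO stage. Hyperparameter values are
# taken from the model card; model id, LoRA shape, and dataset columns are
# assumptions, not confirmed by the card.
import torch
from datasets import load_dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from trl import DPOConfig, DPOTrainer

base = "mistralai/Mistral-7B-Instruct-v0.3"  # assumed base checkpoint

# QLoRA: load the base model quantized to 4-bit NF4 and train LoRA adapters on top.
bnb = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(base, quantization_config=bnb)
tokenizer = AutoTokenizer.from_pretrained(base)

# Adapter shape is a guess; the card does not list r, alpha, or target modules.
peft_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

# Values from the card's "Training hyperparameters" section
# (lr_scheduler_warmup_ratio maps to warmup_ratio, num_epochs to num_train_epochs).
args = DPOConfig(
    output_dir="mistral-nl-dpo",
    learning_rate=5e-6,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=1,
)

# Assumes the dataset already provides prompt/chosen/rejected columns.
dataset = load_dataset("ReBatch/ultrafeedback_nl", split="train")

trainer = DPOTrainer(
    model,
    ref_model=None,  # with a peft_config, TRL uses the frozen base weights as the reference
    args=args,
    train_dataset=dataset,
    tokenizer=tokenizer,
    peft_config=peft_config,
)
trainer.train()
```

Leaving `ref_model=None` is the usual pattern when training with adapters: TRL computes the reference log-probabilities by disabling the LoRA layers, so no second copy of the 7B model is needed in memory.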