Update README.md
README.md
CHANGED
@@ -28,7 +28,7 @@ This model is a fine-tuned version of TODO on [ReBatch/ultrafeedback_nl](https:/
 
 ## Model description
 
-This model is a Dutch chat model, originally developed from Mistral 7B v0.3 Instruct and further finetuned with QLoRA.
+This model is a Dutch chat model, originally developed from Mistral 7B v0.3 Instruct and further finetuned with QLoRA: first with SFT on a chat dataset, then with DPO on a chat feedback dataset.
 
 
 ## Intended uses & limitations
@@ -56,7 +56,7 @@ Mistral-7B-v0.3-Instruct | 60.76 / 45.39 | 13.20 / 34.26 | 23.23 / 59.26 | 48.94
 Finetuned by [Julien Van den Avenne](https://huggingface.co/vandeju)
 
 
-
+## Training hyperparameters
 
 The following hyperparameters were used during training:
 - learning_rate: 5e-06
@@ -70,8 +70,8 @@ The following hyperparameters were used during training:
 - lr_scheduler_type: cosine
 - lr_scheduler_warmup_ratio: 0.1
 - num_epochs: 1
-
-
+
+## Framework versions
 
 - PEFT 0.11.1
 - Transformers 4.41.2
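The card lists the optimizer schedule but not the training script itself. Below is a minimal sketch of how the DPO stage could be wired together, assuming TRL's `DPOTrainer`, 4-bit NF4 quantization for QLoRA, the `mistralai/Mistral-7B-Instruct-v0.3` base checkpoint, and a `prompt`/`chosen`/`rejected` column layout in ReBatch/ultrafeedback_nl; only the hyperparameter values come from the card, everything else is an assumption.

```python
# Hypothetical reconstruction of the DPO stage. Hyperparameter values are
# taken from the model card; model id, LoRA shape, and dataset columns are
# assumptions, not confirmed by the card.
import torch
from datasets import load_dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from trl import DPOConfig, DPOTrainer

base = "mistralai/Mistral-7B-Instruct-v0.3"  # assumed base checkpoint

# QLoRA: load the base model quantized to 4-bit NF4 and train LoRA adapters on top.
bnb = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(base, quantization_config=bnb)
tokenizer = AutoTokenizer.from_pretrained(base)

# Adapter shape is a guess; the card does not list r, alpha, or target modules.
peft_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

# Values from the card's "Training hyperparameters" section
# (lr_scheduler_warmup_ratio maps to warmup_ratio, num_epochs to num_train_epochs).
args = DPOConfig(
    output_dir="mistral-nl-dpo",
    learning_rate=5e-6,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=1,
)

# Assumes the dataset already provides prompt/chosen/rejected columns.
dataset = load_dataset("ReBatch/ultrafeedback_nl", split="train")

trainer = DPOTrainer(
    model,
    ref_model=None,  # with a peft_config, TRL uses the frozen base weights as the reference
    args=args,
    train_dataset=dataset,
    tokenizer=tokenizer,
    peft_config=peft_config,
)
trainer.train()
```

Leaving `ref_model=None` is the usual pattern when training with adapters: TRL computes the reference log-probabilities by disabling the LoRA layers, so no second copy of the 7B model is needed in memory.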