vandeju committed
Commit
f325fbf
1 Parent(s): 50980f2

Update README.md

Files changed (1)
  1. README.md +26 -2
README.md CHANGED
@@ -39,7 +39,7 @@ Use with Mistral's chat template (can be found in the tokenizer)
 ## Training procedure
 
 
- This model was trained with LoRa in bfloat16 with flash attention 2 on 8xA100 SXM with DeepSpeed ZeRO-3; with the DPO script from the [alignment handbook](https://github.com/huggingface/alignment-handbook/) on RunPod.
+ This model was trained with QLoRA in bfloat16 with flash attention 2 on one A100 PCIe, using the DPO script from the [alignment handbook](https://github.com/huggingface/alignment-handbook/), on [RunPod](https://www.runpod.io/).
 
 ## Evaluation results
 
@@ -53,4 +53,28 @@ Mistral-7B-v0.3-Instruct | 60.76 / 45.39 | 13.20 / 34.26 | 23.23 / 59.26 | 48.94
 
 ## Model Developer
 
- Finetuned by [Julien Van den Avenne](https://huggingface.co/vandeju)
+ Finetuned by [Julien Van den Avenne](https://huggingface.co/vandeju)
+
+
+ ### Training hyperparameters
+
+ The following hyperparameters were used during training:
+ - learning_rate: 5e-06
+ - train_batch_size: 3
+ - eval_batch_size: 8
+ - seed: 42
+ - distributed_type: multi-GPU
+ - gradient_accumulation_steps: 2
+ - total_train_batch_size: 6
+ - optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
+ - lr_scheduler_type: cosine
+ - lr_scheduler_warmup_ratio: 0.1
+ - num_epochs: 1
+
+ ### Framework versions
+
+ - PEFT 0.11.1
+ - Transformers 4.41.2
+ - PyTorch 2.2.0+cu121
+ - Datasets 2.19.1
+ - Tokenizers 0.19.1
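
The hyperparameters and framework versions above come straight from the updated README; the actual run used the alignment handbook's DPO script. For orientation only, here is a minimal sketch of what an equivalent QLoRA DPO run can look like with TRL and PEFT. The base model id, dataset name, LoRA rank/alpha, and target modules are placeholders and assumptions, not values recorded in this commit.

```python
# Minimal sketch of a QLoRA + DPO run mirroring the hyperparameters listed above.
# Assumptions: model/dataset ids and the LoRA settings are placeholders; the
# committed README only documents the alignment handbook script and the values below.
import torch
from datasets import load_dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from trl import DPOConfig, DPOTrainer

base_model = "mistralai/Mistral-7B-Instruct-v0.3"              # placeholder base model
dataset = load_dataset("your-org/your-dpo-preference-data")    # placeholder; needs prompt/chosen/rejected columns

bnb_config = BitsAndBytesConfig(              # 4-bit quantization, i.e. the "Q" in QLoRA
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    quantization_config=bnb_config,
    torch_dtype=torch.bfloat16,
    attn_implementation="flash_attention_2",  # flash attention 2, as in the README
)
tokenizer = AutoTokenizer.from_pretrained(base_model)

peft_config = LoraConfig(                     # adapter settings are assumptions
    r=16,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

training_args = DPOConfig(                    # values taken from the hyperparameter list above
    output_dir="dpo-qlora-out",
    learning_rate=5e-6,
    per_device_train_batch_size=3,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=2,            # with batch size 3 -> effective batch size 6
    num_train_epochs=1,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    seed=42,
    bf16=True,
)

trainer = DPOTrainer(
    model,
    args=training_args,
    train_dataset=dataset["train"],
    tokenizer=tokenizer,
    peft_config=peft_config,
)
trainer.train()
```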
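
The first hunk's context line mentions using Mistral's chat template stored in the tokenizer. As a quick illustration of that usage with `transformers`, assuming a placeholder Hub repository id for this model:

```python
# Applying the chat template that ships with the tokenizer.
# "vandeju/<this-model>" is a placeholder for the actual Hub repository id.
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "vandeju/<this-model>"  # placeholder
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id, device_map="auto")

messages = [{"role": "user", "content": "Briefly explain what DPO fine-tuning does."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```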