Upload folder using huggingface_hub
README.md CHANGED
@@ -16,7 +16,7 @@ language: ['en', 'es', 'pt']
This model is a fine-tuned version of [Weni/ZeroShot-3.3.14-Mistral-7b-Multilanguage-3.2.0-merged] on the dataset Weni/zeroshot-dpo-1.0.0 with the DPO trainer. It is part of the ZeroShot project for [Weni](https://weni.ai/).

It achieves the following results on the evaluation set:
-{'eval_loss': 0.
+{'eval_loss': 0.5391563177108765, 'eval_runtime': 23.7839, 'eval_samples_per_second': 2.565, 'eval_steps_per_second': 1.303, 'eval_rewards/chosen': -4.273996829986572, 'eval_rewards/rejected': -11.652483940124512, 'eval_rewards/accuracies': 0.8870967626571655, 'eval_rewards/margins': 7.378485679626465, 'eval_logps/rejected': -25.808551788330078, 'eval_logps/chosen': -20.536710739135742, 'eval_logits/rejected': -1.4332084655761719, 'eval_logits/chosen': -1.4393092393875122, 'epoch': 0.99}

## Intended uses & limitations

@@ -72,14 +72,14 @@ Rejected_response:

The following hyperparameters were used during training:
- learning_rate: 0.0002
-- per_device_train_batch_size:
+- per_device_train_batch_size: 2
-- per_device_eval_batch_size:
+- per_device_eval_batch_size: 2
- gradient_accumulation_steps: 4
- num_gpus: 1
-- total_train_batch_size:
+- total_train_batch_size: 8
- optimizer: AdamW
- lr_scheduler_type: cosine
-- num_steps:
+- num_steps: 67
- quantization_type: bitsandbytes
- LoRA: ("\n - bits: 4\n - use_exllama: True\n - device_map: auto\n - use_cache: False\n - lora_r: 8\n - lora_alpha: 16\n - lora_dropout: 0.1\n - bias: none\n - target_modules: ['q_proj', 'k_proj', 'v_proj', 'o_proj']\n - task_type: CAUSAL_LM",)
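
For readers who want to reproduce a comparable setup, below is a minimal sketch of how the hyperparameters listed in this card could be wired together with TRL's `DPOTrainer`, a PEFT LoRA config, and bitsandbytes 4-bit loading. It is an illustration, not the original training script: the dataset split names, output directory, compute dtype, and DPO `beta` are assumptions not stated in the card, and the call signature assumes an older TRL release where `DPOTrainer` accepts `beta` and `tokenizer` directly (newer releases move these into `DPOConfig`).

```python
# Illustrative sketch only, not the original training script.
# Values marked "from the card" come from the hyperparameter list above;
# everything else is an assumption.
import torch
from datasets import load_dataset
from peft import LoraConfig
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          BitsAndBytesConfig, TrainingArguments)
from trl import DPOTrainer

base_model = "Weni/ZeroShot-3.3.14-Mistral-7b-Multilanguage-3.2.0-merged"

# quantization_type: bitsandbytes (from the card); compute dtype is an assumption
bnb_config = BitsAndBytesConfig(load_in_4bit=True,
                                bnb_4bit_compute_dtype=torch.bfloat16)

model = AutoModelForCausalLM.from_pretrained(
    base_model,
    quantization_config=bnb_config,
    device_map="auto",   # from the card
    use_cache=False,     # from the card
)
tokenizer = AutoTokenizer.from_pretrained(base_model)

# LoRA settings from the card
peft_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.1,
    bias="none",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

# DPO preference dataset named in the card; split names are assumptions
dataset = load_dataset("Weni/zeroshot-dpo-1.0.0")

# Training arguments from the card (2 x 4 accumulation on 1 GPU = batch size 8)
training_args = TrainingArguments(
    output_dir="zeroshot-dpo",          # assumption
    learning_rate=2e-4,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    gradient_accumulation_steps=4,
    max_steps=67,
    lr_scheduler_type="cosine",
    optim="adamw_torch",
)

trainer = DPOTrainer(
    model=model,
    ref_model=None,        # with a PEFT adapter, the frozen base weights act as the reference
    args=training_args,
    beta=0.1,              # assumption: not stated in the card
    train_dataset=dataset["train"],
    eval_dataset=dataset["test"],       # split name is an assumption
    tokenizer=tokenizer,
    peft_config=peft_config,
)
trainer.train()
```

Note that the effective batch size works out as per_device_train_batch_size (2) × gradient_accumulation_steps (4) × num_gpus (1) = 8, which matches the total_train_batch_size reported in the diff above.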