beamaia committed commit b6c4e6a (verified) · parent: 99eab53

Upload folder using huggingface_hub

Files changed (1): README.md (+5 −5)
README.md CHANGED
@@ -16,7 +16,7 @@ language: ['en', 'es', 'pt']
 This model is a fine-tuned version of [Weni/ZeroShot-3.3.14-Mistral-7b-Multilanguage-3.2.0-merged] on the dataset Weni/zeroshot-dpo-1.0.0 with the DPO trainer. It is part of the ZeroShot project for [Weni](https://weni.ai/).
 
 It achieves the following results on the evaluation set:
-{'eval_loss': 0.11184482276439667, 'eval_runtime': 26.2705, 'eval_samples_per_second': 2.322, 'eval_steps_per_second': 0.305, 'eval_rewards/chosen': 5.812995433807373, 'eval_rewards/rejected': -2.4983203411102295, 'eval_rewards/accuracies': 0.9437500238418579, 'eval_rewards/margins': 8.311315536499023, 'eval_logps/rejected': -16.0378475189209, 'eval_logps/chosen': -10.56441879272461, 'eval_logits/rejected': -1.2986871004104614, 'eval_logits/chosen': -1.3477466106414795, 'epoch': 0.94}
+{'eval_loss': 0.5391563177108765, 'eval_runtime': 23.7839, 'eval_samples_per_second': 2.565, 'eval_steps_per_second': 1.303, 'eval_rewards/chosen': -4.273996829986572, 'eval_rewards/rejected': -11.652483940124512, 'eval_rewards/accuracies': 0.8870967626571655, 'eval_rewards/margins': 7.378485679626465, 'eval_logps/rejected': -25.808551788330078, 'eval_logps/chosen': -20.536710739135742, 'eval_logits/rejected': -1.4332084655761719, 'eval_logits/chosen': -1.4393092393875122, 'epoch': 0.99}
 
 ## Intended uses & limitations
 
@@ -72,14 +72,14 @@ Rejected_response:
 
 The following hyperparameters were used during training:
 - learning_rate: 0.0002
-- per_device_train_batch_size: 8
-- per_device_eval_batch_size: 8
+- per_device_train_batch_size: 2
+- per_device_eval_batch_size: 2
 - gradient_accumulation_steps: 4
 - num_gpus: 1
-- total_train_batch_size: 32
+- total_train_batch_size: 8
 - optimizer: AdamW
 - lr_scheduler_type: cosine
-- num_steps: 16
+- num_steps: 67
 - quantization_type: bitsandbytes
 - LoRA: ("\n - bits: 4\n - use_exllama: True\n - device_map: auto\n - use_cache: False\n - lora_r: 8\n - lora_alpha: 16\n - lora_dropout: 0.1\n - bias: none\n - target_modules: ['q_proj', 'k_proj', 'v_proj', 'o_proj']\n - task_type: CAUSAL_LM",)
 
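The reward numbers in both eval dicts are internally consistent: a DPO trainer's `eval_rewards/margins` is the mean of the chosen-minus-rejected reward gap, so it should equal the difference of the two reported means up to float32 rounding. A quick check (the helper function is ours, not from the training script):

```python
# Hypothetical helper: the DPO reward margin is chosen minus rejected reward.
def reward_margin(chosen: float, rejected: float) -> float:
    return chosen - rejected

# (chosen, rejected, reported margin) copied from the two eval dicts in this diff.
runs = [
    (5.812995433807373, -2.4983203411102295, 8.311315536499023),   # before commit
    (-4.273996829986572, -11.652483940124512, 7.378485679626465),  # after commit
]
for chosen, rejected, margin in runs:
    assert abs(reward_margin(chosen, rejected) - margin) < 1e-4
```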
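The batch-size entries also line up: the total train batch size is the per-device batch size times gradient-accumulation steps times GPU count, which explains both the old (8 × 4 × 1 = 32) and new (2 × 4 × 1 = 8) values.

```python
# Effective train batch size derived from the listed hyperparameters.
def total_train_batch_size(per_device: int, grad_accum_steps: int, num_gpus: int) -> int:
    return per_device * grad_accum_steps * num_gpus

assert total_train_batch_size(8, 4, 1) == 32  # before this commit
assert total_train_batch_size(2, 4, 1) == 8   # after this commit
```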
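The LoRA tuple at the bottom is a stringified settings dump. Read as a `peft` `LoraConfig` it would look roughly like the sketch below; this is an assumption on our part, since the actual training script is not part of this diff.

```python
# Sketch only: the README's LoRA settings expressed as a peft LoraConfig.
# The 4-bit bitsandbytes quantization listed alongside it is configured
# separately (e.g. when loading the base model), not in this object.
from peft import LoraConfig

lora_config = LoraConfig(
    r=8,            # lora_r
    lora_alpha=16,
    lora_dropout=0.1,
    bias="none",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
```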