gsmyrnis committed on
Commit 044a00a · verified · 1 Parent(s): a0d6420

Model save

Files changed (1)
  1. README.md +3 -3
README.md CHANGED
@@ -4,7 +4,6 @@ license: apache-2.0
  base_model: Qwen/Qwen2.5-7B-Instruct
  tags:
  - llama-factory
- - full
  - generated_from_trainer
  model-index:
  - name: llama3-1_8b_4o_annotated_aops
@@ -16,7 +15,7 @@ should probably proofread and complete it, then remove this comment. -->

  # llama3-1_8b_4o_annotated_aops

- This model is a fine-tuned version of [Qwen/Qwen2.5-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-7B-Instruct) on the mlfoundations-dev/4o_annotated_aops dataset.
+ This model is a fine-tuned version of [Qwen/Qwen2.5-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-7B-Instruct) on an unknown dataset.

  ## Model description

@@ -36,11 +35,12 @@ More information needed

  The following hyperparameters were used during training:
  - learning_rate: 1e-05
- - train_batch_size: 3
+ - train_batch_size: 1
  - eval_batch_size: 8
  - seed: 42
  - distributed_type: multi-GPU
  - num_devices: 32
+ - gradient_accumulation_steps: 3
  - total_train_batch_size: 96
  - total_eval_batch_size: 256
  - optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
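
The hyperparameter change in this commit leaves `total_train_batch_size` at 96, which is consistent with the usual relation total = per-device batch size × number of devices × gradient accumulation steps. The sketch below checks that arithmetic for both sides of the diff. It assumes `train_batch_size` is the per-device value, as is the Hugging Face Trainer convention, and the helper name `total_train_batch_size` is hypothetical, not part of any library.

```python
def total_train_batch_size(per_device: int, num_devices: int, grad_accum_steps: int = 1) -> int:
    """Effective number of samples per optimizer step.

    Assumption: `per_device` corresponds to the card's `train_batch_size`
    (a per-device value); this helper is illustrative only.
    """
    return per_device * num_devices * grad_accum_steps


# Before the commit: batch size 3 per device, no gradient accumulation listed.
before = total_train_batch_size(per_device=3, num_devices=32)

# After the commit: batch size 1 per device, 3 gradient accumulation steps.
after = total_train_batch_size(per_device=1, num_devices=32, grad_accum_steps=3)

print(before, after)  # 96 96
```

Under that assumption, the commit trades per-device batch size for accumulation steps, which typically reduces peak memory per device while keeping the effective batch size the same.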