Vivian12300 committed on
Commit ae15dd4 · verified · 1 Parent(s): f66c387

End of training

Files changed (1)
  1. README.md +5 -5
README.md CHANGED
@@ -1,7 +1,7 @@
  ---
  library_name: transformers
  license: llama3.1
- base_model: Vivian12300/new_wiki_model
+ base_model: meta-llama/Llama-3.1-8B-Instruct
  tags:
  - trl
  - sft
@@ -9,16 +9,16 @@ tags:
  datasets:
  - generator
  model-index:
- - name: new_wiki_model_aya_sw
+ - name: sparse_ft_en_sw
  results: []
  ---
 
  <!-- This model card has been generated automatically according to the information the Trainer had access to. You
  should probably proofread and complete it, then remove this comment. -->
 
- # new_wiki_model_aya_sw
+ # sparse_ft_en_sw
 
- This model is a fine-tuned version of [Vivian12300/new_wiki_model](https://huggingface.co/Vivian12300/new_wiki_model) on the generator dataset.
+ This model is a fine-tuned version of [meta-llama/Llama-3.1-8B-Instruct](https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct) on the generator dataset.
 
  ## Model description
 
@@ -46,7 +46,7 @@ The following hyperparameters were used during training:
  - optimizer: Use adafactor and the args are:
  No additional optimizer arguments
  - lr_scheduler_type: linear
- - num_epochs: 30
+ - num_epochs: 5
 
  ### Training results
 