Vivian12300/sparse_ft_en_sw

Text Generation

Generated from Trainer

text-generation-inference

Vivian12300 committed on Dec 11, 2024

Commit ae15dd4 · verified · 1 Parent(s): f66c387

End of training

Files changed (1)

README.md +5 -5

@@ -1,7 +1,7 @@
 ---
 library_name: transformers
 license: llama3.1
-base_model: Vivian12300/new_wiki_model
+base_model: meta-llama/Llama-3.1-8B-Instruct
 tags:
 - trl
 - sft
@@ -9,16 +9,16 @@ tags:
 datasets:
 - generator
 model-index:
-- name: new_wiki_model_aya_sw
+- name: sparse_ft_en_sw
   results: []
 ---
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 should probably proofread and complete it, then remove this comment. -->
-# new_wiki_model_aya_sw
+# sparse_ft_en_sw
-This model is a fine-tuned version of [Vivian12300/new_wiki_model](https://huggingface.co/Vivian12300/new_wiki_model) on the generator dataset.
+This model is a fine-tuned version of [meta-llama/Llama-3.1-8B-Instruct](https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct) on the generator dataset.
 ## Model description
@@ -46,7 +46,7 @@ The following hyperparameters were used during training:
 - optimizer: Use adafactor and the args are:
 No additional optimizer arguments
 - lr_scheduler_type: linear
-- num_epochs: 30
+- num_epochs: 5
 ### Training results
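
The hyperparameters visible in this diff (Adafactor optimizer, linear learning-rate schedule, 5 epochs instead of 30) could be written as keyword arguments for `transformers.TrainingArguments`, roughly as sketched below. This is only an illustration under assumptions: the names `training_kwargs` and `previous_num_epochs` are hypothetical, and every other setting (learning rate, batch size, output directory) does not appear in the diff, so it is left out rather than guessed.

```python
# Hypothetical sketch: only the three values shown in the diff are included.
# The keys follow the parameter names of transformers.TrainingArguments;
# whether the author's training script set them this way is an assumption.
training_kwargs = {
    "optim": "adafactor",           # "optimizer: Use adafactor", no extra args
    "lr_scheduler_type": "linear",  # "lr_scheduler_type: linear"
    "num_train_epochs": 5,          # "num_epochs: 5" in the new model card
}

# The previous model card listed 30 epochs for the earlier run.
previous_num_epochs = 30
epoch_ratio = previous_num_epochs / training_kwargs["num_train_epochs"]

print(f"New run uses {training_kwargs['num_train_epochs']} epochs, "
      f"{epoch_ratio:g}x fewer than the previous {previous_num_epochs}.")

# Possible usage, assuming transformers is installed and an output
# directory of your choosing (placeholder name, not from the source):
#   from transformers import TrainingArguments
#   args = TrainingArguments(output_dir="out", **training_kwargs)
```

Keeping the values in a plain dictionary makes the change between the two commits easy to compare before anything is passed to a trainer.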