PyTorch · llama · alignment-handbook · Generated from Trainer
Commit 846f1e3 · 1 parent: d350c32
Junxiong Wang committed

add models

Files changed (2)
  1. README.md +2 -2
  2. configs.yaml +3 -3
README.md CHANGED
@@ -8,7 +8,7 @@ datasets:
 - HuggingFaceH4/orca_dpo_pairs
 - JunxiongWang/llama3-ultrafeedback-armorm
 model-index:
-- name: JunxiongWang/MambaInLlama_0_75
+- name: JunxiongWang/MambaInLlama_0_875
   results: []
 ---
 
@@ -16,7 +16,7 @@ model-index:
 should probably proofread and complete it, then remove this comment. -->
 
 [<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="200" height="32"/>](https://wandb.ai/junxiong12/huggingface/runs/d364ppka)
-# JunxiongWang/MambaInLlama_0_75
+# JunxiongWang/MambaInLlama_0_875
 
 This model is a fine-tuned version of [JunxiongWang/llama3_mamba_0_5_sft](https://huggingface.co/JunxiongWang/llama3_mamba_0_5_sft) on the HuggingFaceH4/ultrafeedback_binarized, the HuggingFaceH4/orca_dpo_pairs and the JunxiongWang/llama3-ultrafeedback-armorm datasets.
 It achieves the following results on the evaluation set:
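For quick reference, here is a minimal loading sketch for the renamed checkpoint. It assumes the repo works with the standard transformers Auto* classes; a hybrid Mamba/Llama checkpoint like this one may instead require the loading utilities from the MambaInLlama codebase. The bfloat16 dtype mirrors torch_dtype in configs.yaml below.

```python
# Minimal sketch (assumption: standard transformers loading works for this
# checkpoint; the MambaInLlama repo may provide its own loader instead).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "JunxiongWang/MambaInLlama_0_875"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # matches torch_dtype in configs.yaml
)
```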
configs.yaml CHANGED
@@ -2,12 +2,12 @@ llama3_mamba_0_875_sft_3dataset_ep1:
   prompt_template: "zephyr-7b-alpha/prompt.txt"
   fn_completions: "huggingface_local_completions"
   completions_kwargs:
-    model_name: "/data/junxiong/sft/dpo/llama3_mamba_0_875_sft_3dataset_ep1/"
+    model_name: "JunxiongWang/MambaInLlama_0_875"
     model_kwargs:
       torch_dtype: 'bfloat16'
     max_new_tokens: 2048
     temperature: 0.7
     top_p: 1.0
     do_sample: True
-  pretty_name: "Mamba 0 5 From Zephyr 7B Beta"
-  link: "https://huggingface.co/HuggingFaceH4/zephyr-7b-beta"
+  pretty_name: "Mamba 0 875 From meta-llama/Meta-Llama-3-8B-Instruct"
+  link: "https://huggingface.co/JunxiongWang/MambaInLlama_0_875"
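The completions_kwargs above map directly onto transformers generation settings. A sketch of the equivalent generate call follows, reusing model and tokenizer from the loading sketch above; the prompt string is a placeholder, since the real harness builds prompts from zephyr-7b-alpha/prompt.txt.

```python
# Sketch: generation settings equivalent to completions_kwargs above.
# "model" and "tokenizer" come from the loading sketch; the prompt is a
# placeholder (the harness formats prompts via zephyr-7b-alpha/prompt.txt).
inputs = tokenizer("Hello, how are you?", return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=2048,  # completions_kwargs: max_new_tokens
    do_sample=True,       # completions_kwargs: do_sample
    temperature=0.7,      # completions_kwargs: temperature
    top_p=1.0,            # completions_kwargs: top_p
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```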