PyTorch · llama · alignment-handbook · Generated from Trainer
Commit 846f1e3 · 1 parent: d350c32
Junxiong Wang committed

add models

Files changed (2)
  1. README.md +2 -2
  2. configs.yaml +3 -3
README.md CHANGED
@@ -8,7 +8,7 @@ datasets:
 - HuggingFaceH4/orca_dpo_pairs
 - JunxiongWang/llama3-ultrafeedback-armorm
 model-index:
-- name: JunxiongWang/MambaInLlama_0_75
+- name: JunxiongWang/MambaInLlama_0_875
   results: []
 ---
 
@@ -16,7 +16,7 @@ model-index:
 should probably proofread and complete it, then remove this comment. -->
 
 [<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="200" height="32"/>](https://wandb.ai/junxiong12/huggingface/runs/d364ppka)
-# JunxiongWang/MambaInLlama_0_75
+# JunxiongWang/MambaInLlama_0_875
 
 This model is a fine-tuned version of [JunxiongWang/llama3_mamba_0_5_sft](https://huggingface.co/JunxiongWang/llama3_mamba_0_5_sft) on the HuggingFaceH4/ultrafeedback_binarized, the HuggingFaceH4/orca_dpo_pairs and the JunxiongWang/llama3-ultrafeedback-armorm datasets.
 It achieves the following results on the evaluation set:
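For quick reference, here is a minimal loading sketch for the renamed checkpoint. It assumes the repo works with the standard transformers Auto* classes; a hybrid Mamba/Llama checkpoint like this one may instead require the loading utilities from the MambaInLlama codebase. The bfloat16 dtype mirrors torch_dtype in configs.yaml below.

```python
# Minimal sketch (assumption: standard transformers loading works for this
# checkpoint; the MambaInLlama repo may provide its own loader instead).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "JunxiongWang/MambaInLlama_0_875"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # matches torch_dtype in configs.yaml
)
```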
configs.yaml CHANGED
@@ -2,12 +2,12 @@ llama3_mamba_0_875_sft_3dataset_ep1:
   prompt_template: "zephyr-7b-alpha/prompt.txt"
   fn_completions: "huggingface_local_completions"
   completions_kwargs:
-    model_name: "/data/junxiong/sft/dpo/llama3_mamba_0_875_sft_3dataset_ep1/"
+    model_name: "JunxiongWang/MambaInLlama_0_875"
     model_kwargs:
       torch_dtype: 'bfloat16'
     max_new_tokens: 2048
     temperature: 0.7
     top_p: 1.0
     do_sample: True
-  pretty_name: "Mamba 0 5 From Zephyr 7B Beta"
-  link: "https://huggingface.co/HuggingFaceH4/zephyr-7b-beta"
+  pretty_name: "Mamba 0 875 From meta-llama/Meta-Llama-3-8B-Instruct"
+  link: "https://huggingface.co/JunxiongWang/MambaInLlama_0_875"
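The completions_kwargs above map directly onto transformers generation settings. A sketch of the equivalent generate call follows, reusing model and tokenizer from the loading sketch above; the prompt string is a placeholder, since the real harness builds prompts from zephyr-7b-alpha/prompt.txt.

```python
# Sketch: generation settings equivalent to completions_kwargs above.
# "model" and "tokenizer" come from the loading sketch; the prompt is a
# placeholder (the harness formats prompts via zephyr-7b-alpha/prompt.txt).
inputs = tokenizer("Hello, how are you?", return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=2048,  # completions_kwargs: max_new_tokens
    do_sample=True,       # completions_kwargs: do_sample
    temperature=0.7,      # completions_kwargs: temperature
    top_p=1.0,            # completions_kwargs: top_p
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```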