Triangle104 committed
Commit 974a362 · verified · 1 Parent(s): 73414db

Update README.md

Files changed (1)
  1. README.md +54 -0
README.md CHANGED
@@ -13,6 +13,60 @@ tags:
  This model was converted to GGUF format from [`nbeerbower/mistral-nemo-narwhal-12B`](https://huggingface.co/nbeerbower/mistral-nemo-narwhal-12B) using llama.cpp via ggml.ai's [GGUF-my-repo](https://huggingface.co/spaces/ggml-org/gguf-my-repo) space.
  Refer to the [original model card](https://huggingface.co/nbeerbower/mistral-nemo-narwhal-12B) for more details on the model.
+ ---
17
+ Model details:
18
+ -
19
+ Mahou-1.5-mistral-nemo-12B-lorablated finetuned on reddit-dpo.
20
+ Method
21
+
22
+ ORPO tuned with 8x A100 for 1 epoch.
23
+
24
+ QLoRA config:
+
+ # QLoRA config
+ bnb_config = BitsAndBytesConfig(
+     load_in_4bit=True,
+     bnb_4bit_quant_type="nf4",
+     bnb_4bit_compute_dtype=torch_dtype,
+     bnb_4bit_use_double_quant=True,
+ )
+ # LoRA config
+ peft_config = LoraConfig(
+     r=16,
+     lora_alpha=32,
+     lora_dropout=0.05,
+     bias="none",
+     task_type="CAUSAL_LM",
+     target_modules=['up_proj', 'down_proj', 'gate_proj', 'k_proj', 'q_proj', 'v_proj', 'o_proj']
+ )
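
To give a feel for what the LoRA settings above imply, here is a minimal sketch that computes the adapter scaling factor (`lora_alpha / r`) and a rough count of trainable adapter parameters. The layer shapes (hidden size 5120, MLP size 14336, 32 query heads and 8 KV heads with head dimension 128, 40 layers) are assumptions based on Mistral Nemo-style dimensions and are not stated in this README; check them against the model's `config.json` before relying on the numbers. The helper names are hypothetical.

```python
# Rough estimate of the parameters added by the LoRA config above.
# All layer shapes are ASSUMED (Mistral Nemo-style); verify against config.json.

R = 16           # r from LoraConfig
LORA_ALPHA = 32  # lora_alpha from LoraConfig

HIDDEN = 5120          # assumed hidden size
INTERMEDIATE = 14336   # assumed MLP intermediate size
Q_OUT = 32 * 128       # assumed 32 attention heads x head_dim 128
KV_OUT = 8 * 128       # assumed 8 key/value heads x head_dim 128
NUM_LAYERS = 40        # assumed number of decoder layers

# (in_features, out_features) for each module listed in target_modules.
TARGET_SHAPES = {
    "q_proj": (HIDDEN, Q_OUT),
    "k_proj": (HIDDEN, KV_OUT),
    "v_proj": (HIDDEN, KV_OUT),
    "o_proj": (Q_OUT, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def lora_scaling(alpha: int, r: int) -> float:
    """Standard LoRA scaling applied to the low-rank update (alpha / r)."""
    return alpha / r


def lora_params(in_features: int, out_features: int, r: int) -> int:
    """LoRA adds A (r x in_features) and B (out_features x r) per module."""
    return r * (in_features + out_features)


def total_lora_params(shapes: dict, r: int, num_layers: int) -> int:
    """Sum adapter parameters over all target modules and layers."""
    per_layer = sum(lora_params(i, o, r) for i, o in shapes.values())
    return per_layer * num_layers


if __name__ == "__main__":
    print("scaling:", lora_scaling(LORA_ALPHA, R))
    print("trainable LoRA params:", total_lora_params(TARGET_SHAPES, R, NUM_LAYERS))
```

Under these assumed shapes the adapters are a small fraction of the 12B base weights, which is the point of pairing LoRA with a 4-bit quantized base model.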
42
+
43
+ Training config:
44
+
45
+ orpo_args = ORPOConfig(
46
+ run_name=new_model,
47
+ learning_rate=8e-6,
48
+ lr_scheduler_type="linear",
49
+ max_length=2048,
50
+ max_prompt_length=1024,
51
+ max_completion_length=1024,
52
+ beta=0.1,
53
+ per_device_train_batch_size=4,
54
+ per_device_eval_batch_size=4,
55
+ gradient_accumulation_steps=1,
56
+ optim="paged_adamw_8bit",
57
+ num_train_epochs=2,
58
+ evaluation_strategy="steps",
59
+ eval_steps=0.2,
60
+ logging_steps=1,
61
+ warmup_steps=10,
62
+ max_grad_norm=10,
63
+ report_to="wandb",
64
+ output_dir="./results/",
65
+ bf16=True,
66
+ gradient_checkpointing=True,
67
+ )
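
A few of the training settings above interact in ways that are easier to see with numbers. The sketch below derives the effective batch size (assuming one process per GPU on the 8x A100 mentioned earlier), the linear warmup/decay learning-rate curve, and the evaluation interval implied by a fractional `eval_steps`. The total step count is hypothetical, since the dataset size is not given here, and the helper names are illustrative rather than part of any library. The schedule mirrors the usual linear-with-warmup rule, and the fractional `eval_steps` handling assumes it is treated as a ratio of total training steps and rounded up.

```python
import math

# Values copied from the ORPOConfig above.
LEARNING_RATE = 8e-6
WARMUP_STEPS = 10
PER_DEVICE_BATCH = 4
GRAD_ACCUM = 1
EVAL_STEPS_RATIO = 0.2

NUM_GPUS = 8  # from "8x A100"; assumes one training process per GPU


def effective_batch_size(per_device: int, grad_accum: int, num_devices: int) -> int:
    """Sequences consumed per optimizer step across all devices."""
    return per_device * grad_accum * num_devices


def linear_lr(step: int, total_steps: int,
              base_lr: float = LEARNING_RATE, warmup: int = WARMUP_STEPS) -> float:
    """Linear warmup from 0 to base_lr, then linear decay to 0 at total_steps."""
    if step < warmup:
        return base_lr * step / max(1, warmup)
    remaining = (total_steps - step) / max(1, total_steps - warmup)
    return base_lr * max(0.0, remaining)


def eval_interval(total_steps: int, ratio: float = EVAL_STEPS_RATIO) -> int:
    """Steps between evaluations when eval_steps is given as a fraction (assumed ceil)."""
    return math.ceil(total_steps * ratio)


if __name__ == "__main__":
    total = 1000  # hypothetical total number of optimizer steps
    print("effective batch:", effective_batch_size(PER_DEVICE_BATCH, GRAD_ACCUM, NUM_GPUS))
    print("lr at steps 0/10/500:", [linear_lr(s, total) for s in (0, 10, 500)])
    print("evaluate every", eval_interval(total), "steps")
```

With `gradient_accumulation_steps=1`, the effective batch is set entirely by the per-device batch and the GPU count, so running the same config on fewer GPUs would shrink it unless accumulation is raised to compensate.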
+
+ ---
  ## Use with llama.cpp
  Install llama.cpp through brew (works on Mac and Linux)