trollek committed on
Commit
2aa5b45
1 Parent(s): 2b7e690

Update README.md

Files changed (1)
  1. README.md +6 -0
README.md CHANGED
@@ -8,6 +8,8 @@ base_model: h2oai/h2o-danube3-4b-base
  ---
  # LittleInstructionJudge-4B-v0.1
  
+ **Update:** The instruct_reward is all out of whack due to a misunderstanding on my part caused by laziness. The other values are fine, though not as useful as they would be if I had actually just read more. Any model with the right prompt is better, even [CleverQwen2-1.5B](https://huggingface.co/trollek/CleverQwen2-1.5B). The next version will be better.
+
  A BAdam fine-tuned danube3-4b-base to do one thing, and one thing only: Being a lightweight LLM-as-a-Judge for instruction prompts.
  
  The purpose of training this model is to have a small language model that can filter away the worst offenders when creating datasets using the Magpie method in hardware constrained environments.
@@ -33,6 +35,10 @@ This is the instruction I need you to judge:
  {{instruction}}
  ```
  
+ ### Quants
+
+ * [mradermacher/LittleInstructionJudge-4B-v0.1-GGUF](https://huggingface.co/mradermacher/LittleInstructionJudge-4B-v0.1-GGUF)
+
  ### LLama-Factory training config
  
  ```yaml