vincentmin committed on
Commit f0db376 · 1 Parent(s): 9877e68

Update README.md

  1. README.md +3 -0
README.md CHANGED
@@ -20,6 +20,9 @@ It achieves the following results on the evaluation set:
  - Loss: 0.5713
  - Accuracy: 0.7435
 
+ See also [vincentmin/llama-2-13b-reward-oasst1](https://huggingface.co/vincentmin/llama-2-13b-reward-oasst1) for a 13b version of this model.
+
+
  ## Model description
 
  This is a reward model trained with QLoRA in 4-bit precision. The base model is [meta-llama/Llama-2-7b-chat-hf](https://huggingface.co/meta-llama/Llama-2-7b-chat-hf), for which you need to have accepted the license in order to be able to use it. Once you've been given permission, you can load the reward model as follows:
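
A minimal sketch of loading a QLoRA reward adapter of this kind, assuming it is published as a PEFT adapter on top of a sequence-classification head with a single output logit. The function name `load_reward_model` and its `adapter_id` parameter are illustrative and not taken from the README, and the README's own snippet may differ. Running it needs `transformers`, `peft`, `bitsandbytes`, a CUDA GPU, and access to the gated base model.

```python
def load_reward_model(adapter_id):
    """Load a 4-bit base model and attach a PEFT reward adapter (hypothetical helper)."""
    # Heavy dependencies are imported lazily so that defining this
    # function does not require them to be installed.
    import torch
    from peft import PeftConfig, PeftModel
    from transformers import (
        AutoModelForSequenceClassification,
        AutoTokenizer,
        BitsAndBytesConfig,
    )

    # The adapter config records which base checkpoint it was trained on.
    peft_config = PeftConfig.from_pretrained(adapter_id)
    base_id = peft_config.base_model_name_or_path

    # Assumption: the reward is a single scalar logit, hence num_labels=1.
    base_model = AutoModelForSequenceClassification.from_pretrained(
        base_id,
        num_labels=1,
        quantization_config=BitsAndBytesConfig(
            load_in_4bit=True,
            bnb_4bit_compute_dtype=torch.float16,
        ),
    )

    # Attach the trained adapter weights to the quantized base model.
    model = PeftModel.from_pretrained(base_model, adapter_id)
    tokenizer = AutoTokenizer.from_pretrained(base_id)
    return model, tokenizer
```

Calling `load_reward_model(...)` with the adapter's repository id would return the model and its tokenizer; the exact id to pass is whatever the model card is published under.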