vincentmin/llama-2-13b-reward-oasst1

Text Classification

Generated from Trainer

vincentmin committed on Jul 27, 2023

Commit 4f94bf4 · 1 Parent(s): 37a71b7

Update README.md

Files changed (1)

README.md +3 -1

README.md CHANGED

@@ -22,6 +22,8 @@ It achieves the following results on the evaluation set:
 - Loss: 0.4810
 - Accuracy: 0.7869
+See also [vincentmin/llama-2-7b-reward-oasst1](https://huggingface.co/vincentmin/llama-2-13b-reward-oasst1) for a 7b version of this model.
 ## Model description
 This is a reward model trained with QLoRA in 4bit precision. The base model is [meta-llama/Llama-2-7b-chat-hf](https://huggingface.co/meta-llama/Llama-2-7b-chat-hf) for which you need to have accepted the license in order to be able use it. Once you've been given permission, you can load the reward model as follows:
@@ -30,7 +32,7 @@ import torch
 from peft import PeftModel, PeftConfig
 from transformers import AutoModelForSequenceClassification, AutoTokenizer
-peft_model_id = "vincentmin/llama-2-7b-reward-oasst1"
+peft_model_id = "vincentmin/llama-2-13b-reward-oasst1"
 config = PeftConfig.from_pretrained(peft_model_id)
 model = AutoModelForSequenceClassification.from_pretrained(
     config.base_model_name_or_path,
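
The hunk above stops partway through the `from_pretrained` call, so the rest of the loading code is not visible in this diff. As a rough sketch only, one way the adapter could be loaded is shown below. The arguments `num_labels=1` and the `BitsAndBytesConfig` settings are assumptions rather than values taken from the README, and `load_reward_model` and `pairwise_accuracy` are hypothetical helper names. The accuracy helper assumes the reported accuracy is the share of evaluation pairs in which the preferred reply receives the higher reward, which is a common convention for reward models but is not stated in the diff.

```python
def load_reward_model(peft_model_id="vincentmin/llama-2-13b-reward-oasst1"):
    """Load the base model in 4-bit and attach the reward adapter.

    Needs a CUDA GPU, bitsandbytes, and access to the gated base model.
    """
    # Heavy imports stay inside the function so the scoring helper below
    # can be used without torch, peft, or transformers installed.
    import torch
    from peft import PeftConfig, PeftModel
    from transformers import (
        AutoModelForSequenceClassification,
        AutoTokenizer,
        BitsAndBytesConfig,
    )

    config = PeftConfig.from_pretrained(peft_model_id)
    model = AutoModelForSequenceClassification.from_pretrained(
        config.base_model_name_or_path,
        num_labels=1,  # assumption: one scalar reward per sequence
        quantization_config=BitsAndBytesConfig(  # assumption: 4-bit, fp16 compute
            load_in_4bit=True,
            bnb_4bit_compute_dtype=torch.float16,
        ),
    )
    # Attach the QLoRA adapter weights on top of the quantized base model.
    model = PeftModel.from_pretrained(model, peft_model_id)
    tokenizer = AutoTokenizer.from_pretrained(config.base_model_name_or_path)
    model.eval()
    return model, tokenizer


def pairwise_accuracy(chosen_rewards, rejected_rewards):
    """Fraction of pairs where the chosen reply scores strictly higher."""
    if len(chosen_rewards) != len(rejected_rewards):
        raise ValueError("both lists must have the same length")
    if not chosen_rewards:
        raise ValueError("at least one pair is required")
    wins = sum(c > r for c, r in zip(chosen_rewards, rejected_rewards))
    return wins / len(chosen_rewards)
```

Rewards for a pair of replies would come from the `logits` of the loaded model. `pairwise_accuracy` then summarises how often the preferred reply wins across a set of such pairs.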