billmianz committed on
Commit e7def3e · verified · 1 Parent(s): 6ca9fcc

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -8,7 +8,7 @@ base_model:
  - meta-llama/Llama-3.1-8B-Instruct
  ---
 
- The reward model presented in the paper [Preference Learning Unlocks LLMs' Psycho-Counseling Skills](https://hf.co/papers/2502.19731). It's a fine-tuned Llama 3 model trained using preference learning on the [PsychoCounsel-Preference](https://huggingface.co/datasets/Psychotherapy-LLM/PsychoCounsel-Preference) dataset.
+ The reward model presented in the paper [Preference Learning Unlocks LLMs' Psycho-Counseling Skills](https://hf.co/papers/2502.19731). It's a fine-tuned [meta-llama/Llama-3.1-8B-Instruct](https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct) model trained using preference learning on the [PsychoCounsel-Preference](https://huggingface.co/datasets/Psychotherapy-LLM/PsychoCounsel-Preference) dataset.
  This policy model, [PsychoCounsel-Llama3-8B](https://huggingface.co/Psychotherapy-LLM/PsychoCounsel-Llama3-8B), trained with this model with online preference learning, achieves an impressive win rate of 87% against GPT-4o in psycho-counseling tasks.
 
 