Update README.md
README.md
CHANGED
@@ -11,7 +11,7 @@ language:
 - **Model Size:** 0.5B parameters
 - **Model Type:** Instruction-following Language Model
 - **Training Data**: About 700 high quality preference entries annotated by GPT-4.
-- **Training Procedure**: The DPO-Positive algorithm introduced abacusai was used to train this model.
+- **Training Procedure**: The DPO-Positive algorithm introduced by abacusai was used to train this model.
 
 ## Model Use
 tau-instruct-0.5B-DPOP is an instruction-following language model designed to follow user instructions and provide assistance across a wide range of tasks, including but not limited to: