Update README.md
README.md
CHANGED
@@ -11,9 +11,9 @@ model-index:
 license: artistic-2.0
 ---

-# juanako-7b-v1
+# juanako-7b-v1 (UNA: Uniform Neural Alignment)

-This model is a fine-tuned version of [fblgit/zephyr-lora-dpo-b1](https://huggingface.co/fblgit/zephyr-lora-dpo-b1) on the HuggingFaceH4/ultrafeedback_binarized dataset.
+This model uses uniform neural alignment (UNA) for the DPO training phases and is a fine-tuned version of [fblgit/zephyr-lora-dpo-b1](https://huggingface.co/fblgit/zephyr-lora-dpo-b1) on the HuggingFaceH4/ultrafeedback_binarized dataset.
 It achieves the following results on the evaluation set:
 - Loss: 0.4594
 - Rewards/chosen: -1.1095
@@ -27,7 +27,7 @@ It achieves the following results on the evaluation set:

 Followed [alignment-handbook](https://github.com/huggingface/alignment-handbook) to perform DPO (Phase 2) over Zephyr-SFT model.

-**Please feel free to run more tests and commit the results. Also if you are interested to participate in [UNA's paper research or GPU sponsorship](mailto:[email protected])
+**Please feel free to run more tests and commit the results. Also if you are interested to participate in [UNA's paper research or GPU sponsorship](mailto:[email protected]) to support UNA research, feel free to contact.**

 Special thanks to [TheBloke](https://huggingface.co/TheBloke) for converting the model into multiple formats and overall his enormous contribution to the community.
 Here are the models:
@@ -263,6 +263,7 @@ hf (pretrained=fblgit/juanako-7b-v1,load_in_4bit=False,dtype=float16), limit: No
 | - stem |N/A |none |acc |0.5217|± |0.1149|

 ### Citations
+Please feel free to raise a PR if there is any missing citation.

 @misc{tunstall2023zephyr,
 title={Zephyr: Direct Distillation of LM Alignment},