CausalLM/14B-DPO-alpha


JosephusCheung committed on Nov 5, 2023

Commit 26063e9 · 1 Parent(s): f8d90dc

Update README.md

Files changed (1)

README.md +18 -0


@@ -32,3 +32,21 @@ tags:
 - llama2
 - qwen
 ---

+| Model                     | MT-Bench     |
+| ------------------------- | ------------ |
+| GPT-4                     | 8.99         |
+| GPT-3.5-Turbo             | 7.94         |
+|                           |              |
+| Zephyr-7b-β (Overfitting) | 7.34         |
+| Zephyr-7b-α               | 6.88         |
+|                           |              |
+| **CausalLM/14B-DPO-α**    | **7.618868** |
+| **CausalLM/7B-DPO-α**     | **7.038125** |
+Note that this version does not continue training from CausalLM/14B & 7B. It is an optimized version that underwent DPO training concurrently on an earlier training branch, so some detailed parameters may have changed. You will still need to download the full model.
+The beta branch will be released soon. To achieve better alignment with human preferences, it employs some aggressive approaches that might be detrimental in certain tasks, and it aims to meet or exceed the GPT-3.5 benchmarks. Stay tuned.
+需要注意的是，这并不是在 CausalLM/14B & 7B 上继续训练的版本，而是在之前的训练分支上同时进行了 DPO 训练的优化版本，一些细节参数可能发生了变化。 您仍然需要下载完整模型。
+很快将会发布beta分支，采用了一些可能不利于某些任务的激进方法，以实现更好地符合人类偏好以接近和超过GPT-3.5基准。敬请期待。
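
To put the stated goal of meeting or exceeding GPT-3.5 in numbers, the following minimal sketch computes how far each DPO-α model sits from a reference score. The scores are copied from the MT-Bench table in this commit; the `MT_BENCH` dictionary and the `gap_to` helper are hypothetical names introduced only for illustration and are not part of the repository.

```python
# MT-Bench scores copied from the table added in this commit.
MT_BENCH = {
    "GPT-4": 8.99,
    "GPT-3.5-Turbo": 7.94,
    "Zephyr-7b-β": 7.34,
    "Zephyr-7b-α": 6.88,
    "CausalLM/14B-DPO-α": 7.618868,
    "CausalLM/7B-DPO-α": 7.038125,
}


def gap_to(reference: str, model: str, scores=MT_BENCH) -> float:
    """Return how many points `model` trails `reference` (negative if it leads)."""
    return scores[reference] - scores[model]


if __name__ == "__main__":
    # Report the remaining distance to GPT-3.5-Turbo for both DPO-α models.
    for name in ("CausalLM/14B-DPO-α", "CausalLM/7B-DPO-α"):
        print(f"{name}: {gap_to('GPT-3.5-Turbo', name):.3f} points from GPT-3.5-Turbo")
```

A negative result means the model leads the reference, which is the case when comparing CausalLM/14B-DPO-α against Zephyr-7b-β.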