JosephusCheung committed · Commit 36501a5 · 1 Parent(s): 25d3c1b

Update README.md

Files changed (1): README.md (+4 −0)
README.md CHANGED
@@ -32,6 +32,8 @@ tags:
 - llama2
 - qwen
 ---
+For details, please refer to the version without DPO training: [CausalLM/7B](https://huggingface.co/CausalLM/7B).
+
 | Model | MT-Bench |
 | ------------------------- | ------------ |
 | GPT-4 | 8.99 |
@@ -49,6 +51,8 @@ The beta branch will soon be released, employing some aggressive approaches that
 
 Disclaimer: Please note that the model was trained on unfiltered internet data. Since we do not have the capacity to vet all of it, there may be a substantial amount of objectionable content, pornography, violence, and offensive language present that we are unable to remove. Therefore, you will still need to complete your own checks on the model's safety and filter keywords in the output. Due to computational resource constraints, we are presently unable to implement RLHF for the model's ethics and safety, nor to train on SFT samples that refuse to answer certain questions for restrictive fine-tuning.
 
+For more details, please refer to the version without DPO training: [CausalLM/14B](https://huggingface.co/CausalLM/14B)
+
 Please note that this is not a version trained further on CausalLM/14B & 7B; it is an optimized version of the earlier training branch with DPO training applied in parallel, and some detail parameters may have changed. You still need to download the full model.
 
 The beta branch will be released soon; it adopts some aggressive approaches that may be unfavorable for certain tasks in order to better align with human preferences, approaching and surpassing the GPT-3.5 benchmark. Stay tuned.