shareAI/llama3-8b-Chinese-Instruct-DPO-beta0.5

	---
	license: apache-2.0
	language:
	- zh
	library_name: transformers
	tags:
	- llama
	- llama3
	- dpo-zh
	- emoji
	datasets:
	- shareAI/DPO-zh-en-emoji
	---

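	GitHub: https://github.com/CrazyBoyM/llama3-Chinese-chat
	Training recipe details, shared here for reference:
	DPO (beta 0.5) + LoRA with rank 128 and alpha 256, with the "lm_head", "input_layernorm", "post_attention_layernorm", and "norm" layers also opened for training.
	The model prefers Chinese and emoji in its responses, without degrading the capabilities of the original Instruct model.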
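
To make the recipe's numbers concrete, here is a minimal sketch of the standard per-pair DPO loss with beta set to 0.5, plus the LoRA scaling factor implied by rank 128 and alpha 256 under the usual alpha/rank convention. This is illustrative only and not the authors' training code: the function name `dpo_loss` and the example log-probabilities are assumptions, and the actual training framework is not stated in this card.

```python
import math

BETA = 0.5        # DPO beta, from the recipe above
LORA_RANK = 128   # LoRA rank, from the recipe above
LORA_ALPHA = 256  # LoRA alpha, from the recipe above

# Under the standard LoRA convention the low-rank update is scaled by alpha / rank.
lora_scale = LORA_ALPHA / LORA_RANK


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=BETA):
    """Standard per-pair DPO loss: -log(sigmoid(beta * margin)).

    Inputs are summed log-probabilities of the chosen and rejected
    responses under the trained policy and the frozen reference model.
    """
    # How much more the policy prefers chosen over rejected than the reference does.
    margin = ((policy_chosen_logp - ref_chosen_logp)
              - (policy_rejected_logp - ref_rejected_logp))
    x = beta * margin
    # Numerically stable form of -log(sigmoid(x)).
    if x > 0:
        return math.log1p(math.exp(-x))
    return -x + math.log1p(math.exp(x))


if __name__ == "__main__":
    # Hypothetical log-probabilities, chosen only to illustrate the behaviour.
    print("LoRA scale:", lora_scale)
    print("policy agrees with preference:", dpo_loss(-10.0, -12.0, -11.0, -11.0))
    print("policy disagrees with preference:", dpo_loss(-12.0, -10.0, -11.0, -11.0))
```

A larger beta penalizes drift from the reference model more strongly, which is consistent with the card's stated goal of keeping the original Instruct model's capabilities while shifting the response style.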

	![image/png](https://cdn-uploads.huggingface.co/production/uploads/631f5b422225f12fc0f2c838/2xlWxZvN0gahckA2EPmlE.png)

	Download with Git
	```
	# Download the model with Git
	git clone https://www.modelscope.cn/baicai003/Llama3-Chinese-instruct-DPO-beta0.5.git
	```