Update README.md

README.md

@@ -17,4 +17,4 @@ tags:
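 For the legal data we use three sources: [ChatGPT's answers to the JEC-QA Chinese legal examination dataset](https://raw.githubusercontent.com/AndrewZhe/lawyer-llama/main/data/judical_examination.json), [ChatGPT answering questions in the role of a lawyer](https://raw.githubusercontent.com/AndrewZhe/lawyer-llama/main/data/legal_advice.json), and [legal knowledge Q&A](https://github.com/thunlp/CAIL), 23,209 examples in total. Although we were able to find some real-world legal Q&A data, such data tends to be noisy (for example, impatient replies such as `"go ask yourself"`), so we did not use it.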
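-We format the data in the [chat](https://github.com/Facico/Chinese-Vicuna/blob/master/sample/chat/data_sample.jsonl) format and, starting from [chatv1](https://huggingface.co/Chinese-Vicuna/Chinese-Vicuna-lora-7b-chatv1), use [continue-training](https://github.com/Facico/Chinese-Vicuna/blob/master/scripts/finetune_chat_continue.sh) to train for nearly 6 more epochs. In our tests this not only improved legal Q&A ability but also retained some general Q&A ability. It is also possible to fine-tune directly from the Llama base: the legal Q&A ability is similar, but the model will not have general Q&A ability. The model has been uploaded to [huggingface](https://huggingface.co/Chinese-Vicuna/Chinese-Vicuna-7b-legal-lora).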


+We format the data in the [chat](https://github.com/Facico/Chinese-Vicuna/blob/master/sample/chat/data_sample.jsonl) format and, starting from [chatv1](https://huggingface.co/Chinese-Vicuna/Chinese-Vicuna-lora-7b-chatv1), use [continue-training](https://github.com/Facico/Chinese-Vicuna/blob/master/scripts/finetune_chat_continue.sh) to train for nearly 6 more epochs. In our tests this not only improved legal Q&A ability but also retained some general Q&A ability.
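
The formatting step described in this README line could look roughly like the sketch below, which turns question/answer pairs into one JSON object per line (JSONL). The `input`/`output` field names and the helper names `to_chat_record` and `to_jsonl` are assumptions made for illustration only; the actual schema should be checked against the linked `data_sample.jsonl` before use.

```python
import json


def to_chat_record(question, answer):
    # Hypothetical schema: a single-turn dialogue stored as parallel lists.
    # Verify the real field names against data_sample.jsonl.
    return {"input": [question], "output": [answer]}


def to_jsonl(pairs):
    # Serialize each (question, answer) pair as one JSON object per line.
    # ensure_ascii=False keeps Chinese text readable in the output file.
    return "\n".join(
        json.dumps(to_chat_record(q, a), ensure_ascii=False) for q, a in pairs
    )


if __name__ == "__main__":
    sample = [("What is a contract?", "A legally binding agreement.")]
    print(to_jsonl(sample))
```

Keeping the conversion in a small pure function makes it easy to reuse across the three data sources, each of which may need its own field mapping before calling it.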