---
license: apache-2.0
language:
- en
- zh
metrics:
- bleu
- rouge
tags:
- medical
---

# MindGLM: A Fine-tuned Language Model for Chinese Psychological Counseling

## 1. Introduction

MindGLM is a large language model fine-tuned and aligned for psychological counseling in Chinese. Built on the ChatGLM2-6B base model, MindGLM is aligned with human preferences in counseling conversations, offering a reliable and safe tool for digital psychological support.

## 2. Key Features

- Fine-tuned for Counseling: MindGLM has been carefully trained to understand and respond to psychological inquiries with empathetic and accurate responses.

- Aligned with Human Preferences: the model underwent a rigorous alignment process so that its responses reflect human values and preferences in psychological counseling.

- High Performance: MindGLM demonstrated strong results in both quantitative and qualitative evaluations, making it well suited to digital psychological interventions.

## 3. Usage

To use MindGLM with the Hugging Face Transformers library:

```python
from transformers import AutoTokenizer, AutoModel

# ChatGLM2-based checkpoints need trust_remote_code=True so that the model
# class providing the chat() method is loaded.
tokenizer = AutoTokenizer.from_pretrained("ZhangCNN/MindGLM", trust_remote_code=True)
model = AutoModel.from_pretrained("ZhangCNN/MindGLM", trust_remote_code=True)
model = model.eval()
# Optional: model = model.half().cuda() to run on a GPU in half precision.

history = []
end_command = "结束对话"
max_length = 600   # character budget for the kept history
max_turns = 12     # maximum number of kept dialogue turns

instruction = "假设你是友善的心理辅导师,避免透露任何AI或技术的信息。请主动引导咨询者交流并帮助他们解决心理问题。"

# Read the first user input.
prompt = input("求助者:")

# Prepend the instruction and the speaker prefixes.
prompt = instruction + "求助者:" + prompt + "。支持者:"
response, history = model.chat(tokenizer, prompt, history=[])
print("支持者: " + response)

while True:
    # Read the next user input.
    user_input = input("求助者:")

    # Stop when the end command is received.
    if user_input == end_command:
        break

    # Prepend the speaker prefixes.
    prompt = "求助者:" + user_input + "。支持者:"
    response, history = model.chat(tokenizer, prompt, history=history)
    print("支持者: " + response)

    # Total character length of the kept history, including the new turn.
    total_length = len(instruction) + sum(len(f'{item[0]} {item[1]}') for item in history)

    # If the history exceeds the character or turn limit, drop old turns.
    # The first entry contains the instruction, so keep it and drop the
    # second-oldest entry instead.
    while (total_length > max_length or len(history) > max_turns) and len(history) > 1:
        removed_item = history.pop(1)
        total_length -= len(f'{removed_item[0]} {removed_item[1]}')
```
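
If GPU memory is tight, ChatGLM2-6B's bundled remote code also supports on-the-fly INT4 quantization. The card does not state whether the MindGLM checkpoint keeps this helper, so the following is an untested sketch under that assumption:

```python
from transformers import AutoTokenizer, AutoModel

# Untested sketch: assumes the MindGLM checkpoint retains ChatGLM2-6B's
# remote-code helpers (chat() and quantize()).
tokenizer = AutoTokenizer.from_pretrained("ZhangCNN/MindGLM", trust_remote_code=True)
model = AutoModel.from_pretrained("ZhangCNN/MindGLM", trust_remote_code=True).quantize(4).cuda()
model = model.eval()

# A single-turn query; in practice, prepend the same instruction and
# "求助者:…。支持者:" prefixes used in the script above.
response, history = model.chat(tokenizer, "最近我总是睡不好,压力很大。", history=[])
print("支持者: " + response)
```
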
## 4. Training Data

MindGLM was trained on a combination of open-source and self-constructed datasets covering a broad range of psychological counseling scenarios. The datasets include SmileConv, comparison_data_v1, psychology-RLAIF, rm_labelled_180, and rm_gpt_375.

## 5. Training Process

The model went through a three-phase training pipeline, with LoRA adapters used in every phase (see the illustrative sketch after this list):

- Supervised Fine-tuning: starting from the ChatGLM2-6B base model, MindGLM was fine-tuned on a dedicated psychological counseling dataset.

- Reward Model Training: a reward model was trained to evaluate and score the responses of the fine-tuned model.

- Reinforcement Learning: the model was further aligned with the PPO (Proximal Policy Optimization) algorithm so that its responses follow human preferences.
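
The card does not specify which libraries or hyperparameters were used in these phases. As a rough orientation only, the sketch below shows how a LoRA adapter for the supervised fine-tuning phase could be attached with the `peft` library, together with the pairwise ranking loss commonly used when training a reward model on comparison data; every name and value here is an assumption, not the authors' actual configuration.

```python
# Illustrative sketch only -- not the authors' actual training code.
# Assumes the peft and transformers libraries; all hyperparameters are hypothetical.
import torch
from transformers import AutoModel
from peft import LoraConfig, TaskType, get_peft_model

# Phase 1 (supervised fine-tuning): attach a LoRA adapter to the ChatGLM2-6B base.
base = AutoModel.from_pretrained("THUDM/chatglm2-6b", trust_remote_code=True)
lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=8,                                 # hypothetical rank
    lora_alpha=32,
    lora_dropout=0.1,
    target_modules=["query_key_value"],  # ChatGLM2's fused attention projection
)
model = get_peft_model(base, lora_config)
model.print_trainable_parameters()

# Phase 2 (reward model): a common objective is the pairwise ranking loss,
# which pushes the reward of the preferred response above the rejected one.
def pairwise_reward_loss(reward_chosen: torch.Tensor, reward_rejected: torch.Tensor) -> torch.Tensor:
    return -torch.nn.functional.logsigmoid(reward_chosen - reward_rejected).mean()
```

Phase 3 (reinforcement learning) is typically run with an RLHF library such as trl, with the reward model scoring each generated reply during PPO updates; the card does not state which implementation was used here.
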
## 6. Limitations

While MindGLM is a powerful tool, users should be aware of its limitations:

- It is designed for psychological counseling support and must not replace professional medical advice or intervention.

- Its responses are derived from the training data; although the model is aligned with human preferences, it may not always give the most appropriate response.

## 7. License

MindGLM is released under the Apache-2.0 license (see the metadata above). Please also refer to the licensing terms of the datasets used for training; usage of MindGLM should comply with those licenses.

## 8. Contact Information

For any queries, feedback, or collaboration opportunities, please reach out to:

- Name: Congmian Zhang (张淙冕)
- Email: [email protected]
- WeChat: Zhang_CNN
- Affiliation: University of Glasgow

We hope MindGLM proves to be a valuable asset for digital psychological counseling in the Chinese-speaking community. Your feedback and contributions are always welcome!