---
language: en
license: llama2
library_name: transformers
tags:
- causal-lm
- mental-health
- text-generation
datasets:
- heliosbrahma/mental_health_chatbot_dataset
model_creator: Jjateen Gundesha
base_model: NousResearch/llama-2-7b-chat-hf
finetuned_from: NousResearch/llama-2-7b-chat-hf
---

## **🦙 Model Card for LLaMA-2-7B-Mental-Chat**

This model is a fine-tuned version of Meta's LLaMA 2 7B, designed specifically for mental health-focused conversational applications. It provides empathetic, supportive, and informative responses related to mental well-being.

---

## Model Details

### Model Description

**LLaMA-2-7B-Mental-Chat** is optimized for natural language conversations in mental health contexts. Fine-tuned on a curated dataset of mental health dialogues, it aims to assist with stress management and general well-being, and to provide empathetic support.

- **Developed by:** [Jjateen Gundesha](https://huggingface.co/Jjateen)
- **Funded by:** Personal project
- **Shared by:** [Jjateen Gundesha](https://huggingface.co/Jjateen)
- **Model type:** Transformer-based large language model (LLM)
- **Language(s):** English
- **License:** [Meta's LLaMA 2 Community License](https://ai.meta.com/llama/license/)
- **Fine-tuned from:** [NousResearch/llama-2-7b-chat-hf](https://huggingface.co/NousResearch/Llama-2-7b-chat-hf)

---

### Model Sources

- **Repository:** [LLaMA-2-7B-Mental-Chat on Hugging Face](https://huggingface.co/Jjateen/llama-2-7b-mental-chat)
- **Paper:** Not available
- **Demo:** Coming soon

---

## Uses

### Direct Use

- **Mental health chatbot:** Provides empathetic, non-clinical support on mental health topics such as anxiety, stress, and general well-being.
- **Conversational AI:** Answers user queries with empathetic responses.

### Downstream Use

- **Fine-tuning:** Can be adapted for specialized mental health domains or multilingual support.
- **Integration:** Deployable in chatbot frameworks or virtual assistants.

### Out-of-Scope Use

- **Clinical diagnosis:** Not suitable for medical or therapeutic advice.
- **Crisis management:** Must not be used in critical situations that require professional intervention.

---

## Bias, Risks, and Limitations

### Biases

- May reflect biases in the mental health datasets used, especially around cultural or social norms.
- Risk of generating inappropriate or overly simplistic responses to complex issues.

### Limitations

- Not a substitute for professional mental health care.
- Limited to English; performance may degrade with non-native phrasing or dialects.

---

### Recommendations

Monitor outputs for appropriateness, especially in sensitive or high-stakes situations, and make sure users understand that this model is not a replacement for professional mental health services.

---

## How to Get Started with the Model

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the fine-tuned model and its tokenizer from the Hugging Face Hub.
tokenizer = AutoTokenizer.from_pretrained("Jjateen/llama-2-7b-mental-chat")
model = AutoModelForCausalLM.from_pretrained("Jjateen/llama-2-7b-mental-chat")

input_text = "I feel overwhelmed and anxious. What should I do?"
inputs = tokenizer(input_text, return_tensors="pt")

# Generate up to 200 new tokens beyond the prompt.
output = model.generate(**inputs, max_new_tokens=200)
response = tokenizer.decode(output[0], skip_special_tokens=True)
print(response)
```
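
Because the base model is a Llama 2 chat variant, wrapping prompts in Llama 2's `[INST]` instruction format may produce better-behaved responses. A minimal sketch, reusing the `tokenizer` and `model` loaded above and assuming the fine-tune preserved the base model's chat format; `build_prompt` is a hypothetical helper, not part of this repository:

```python
def build_prompt(user_message: str,
                 system_prompt: str = "You are a supportive, empathetic assistant.") -> str:
    # Llama 2 chat format: system prompt in <<SYS>> tags, user turn in [INST] ... [/INST].
    return f"[INST] <<SYS>>\n{system_prompt}\n<</SYS>>\n\n{user_message} [/INST]"

prompt = build_prompt("I feel overwhelmed and anxious. What should I do?")
inputs = tokenizer(prompt, return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=200, do_sample=True, temperature=0.7)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```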

---

## Training Details

### Training Data

- **Dataset:** [heliosbrahma/mental_health_chatbot_dataset](https://huggingface.co/datasets/heliosbrahma/mental_health_chatbot_dataset)
- **Preprocessing:** Text normalization, tokenization, and filtering for quality.

### Training Procedure

- **Framework:** PyTorch
- **Epochs:** 3
- **Batch Size:** 8
- **Optimizer:** AdamW
- **Learning Rate:** 5e-6
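
As a rough illustration, the hyperparameters above map onto the Hugging Face `Trainer` API along these lines. This is a sketch, not the exact training script; the tokenization settings and the dataset's `text` column are assumptions:

```python
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

base = "NousResearch/llama-2-7b-chat-hf"
tokenizer = AutoTokenizer.from_pretrained(base)
tokenizer.pad_token = tokenizer.eos_token  # Llama 2 ships without a pad token
model = AutoModelForCausalLM.from_pretrained(base)

# Tokenize the dialogue dataset for causal language modeling.
dataset = load_dataset("heliosbrahma/mental_health_chatbot_dataset", split="train")
dataset = dataset.map(
    lambda ex: tokenizer(ex["text"], truncation=True, max_length=512),
    remove_columns=dataset.column_names,
)

args = TrainingArguments(
    output_dir="llama-2-7b-mental-chat",
    num_train_epochs=3,             # Epochs: 3
    per_device_train_batch_size=8,  # Batch Size: 8
    learning_rate=5e-6,             # Learning Rate: 5e-6
    optim="adamw_torch",            # Optimizer: AdamW
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=dataset,
    # Causal LM collator: pads batches and copies input_ids to labels (mlm=False).
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```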

---

### Speeds, Sizes, Times

- **Training Time:** Approximately 48 hours on NVIDIA A100 GPUs
- **Model Size:** 10.5 GB (split across 2 `.bin` files)

---

## Evaluation

### Testing Data, Factors & Metrics

#### Testing Data

- A held-out validation set of mental health dialogues.

#### Metrics

- **Empathy Score:** Evaluated through human feedback.
- **Relevance:** Measured as adherence to conversational context.
- **Perplexity:** Lower perplexity on mental health data than the base model.
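
The perplexity comparison can be reproduced along these lines: perplexity is the exponential of the mean token-level cross-entropy on held-out text. A minimal sketch; the sample passage is illustrative, not drawn from the actual validation set:

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("Jjateen/llama-2-7b-mental-chat")
model = AutoModelForCausalLM.from_pretrained("Jjateen/llama-2-7b-mental-chat")

text = "I have been feeling anxious lately and I am struggling to sleep."
enc = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    # With labels set, the model returns the mean cross-entropy over tokens.
    loss = model(**enc, labels=enc["input_ids"]).loss
print(f"Perplexity: {torch.exp(loss).item():.2f}")
```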

### Results

| Metric            | Score  |
|-------------------|--------|
| **Empathy Score** | 85/100 |
| **Relevance**     | 90%    |
| **Safety**        | 95%    |

---

## Environmental Impact

- **Hardware Type:** NVIDIA A100 GPUs
- **Hours Used:** 48 hours
- **Cloud Provider:** AWS
- **Compute Region:** US East
- **Carbon Emitted:** Estimated using the [ML Impact Calculator](https://mlco2.github.io/impact#compute)

---

## Technical Specifications

### Model Architecture and Objective

- Transformer architecture (decoder-only)
- Fine-tuned with a causal language modeling objective

### Compute Infrastructure

- **Hardware:** 4x NVIDIA A100 GPUs
- **Software:** PyTorch, Hugging Face Transformers

---

## Citation

**BibTeX:**

```bibtex
@misc{jjateen_llama2_mentalchat_2024,
  title={LLaMA-2-7B-Mental-Chat},
  author={Jjateen Gundesha},
  year={2024},
  howpublished={\url{https://huggingface.co/Jjateen/llama-2-7b-mental-chat}}
}
```

---

## Model Card Contact

For any questions or feedback, please contact [Jjateen Gundesha](https://huggingface.co/Jjateen).