Update README.md
README.md
CHANGED
@@ -6,197 +6,101 @@ tags:
|
|
6 |
- sft
|
7 |
---
|
8 |
|
9 |
-
# Model Card for
|
10 |
-
|
11 |
-
<!-- Provide a quick summary of what the model is/does. -->
|
12 |
-
|
13 |
-
|
14 |
|
15 |
## Model Details
|
16 |
|
17 |
### Model Description
|
18 |
|
19 |
-
|
|
|
|
|
|
|
20 |
|
21 |
-
|
22 |
|
23 |
-
|
24 |
-
- **Funded by [optional]:** [More Information Needed]
|
25 |
-
- **Shared by [optional]:** [More Information Needed]
|
26 |
-
- **Model type:** [More Information Needed]
|
27 |
-
- **Language(s) (NLP):** [More Information Needed]
|
28 |
-
- **License:** [More Information Needed]
|
29 |
-
- **Finetuned from model [optional]:** [More Information Needed]
|
30 |
|
31 |
-
|
32 |
|
33 |
-
|
34 |
|
35 |
-
- **Repository:** [More Information Needed]
|
36 |
-
- **Paper [optional]:** [More Information Needed]
|
37 |
-
- **Demo [optional]:** [More Information Needed]
|
38 |
|
39 |
-
|
|
|
|
|
|
|
|
|
40 |
|
41 |
-
|
42 |
|
43 |
-
|
|
|
|
|
44 |
|
45 |
-
|
46 |
|
47 |
-
|
48 |
|
49 |
-
|
|
|
50 |
|
51 |
-
|
52 |
|
53 |
-
|
54 |
|
55 |
### Out-of-Scope Use
|
56 |
|
57 |
-
|
58 |
-
|
59 |
-
[More Information Needed]
|
60 |
|
61 |
## Bias, Risks, and Limitations
|
62 |
|
63 |
-
|
64 |
-
|
65 |
-
[More Information Needed]
|
66 |
|
67 |
### Recommendations
|
68 |
|
69 |
-
|
70 |
-
|
71 |
-
Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
|
72 |
|
73 |
## How to Get Started with the Model
|
74 |
|
75 |
-
|
76 |
-
|
77 |
-
[More Information Needed]
|
78 |
|
79 |
## Training Details
|
80 |
|
81 |
### Training Data
|
82 |
|
83 |
-
|
84 |
-
|
85 |
-
[More Information Needed]
|
86 |
|
87 |
### Training Procedure
|
88 |
|
89 |
-
<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
|
90 |
-
|
91 |
-
#### Preprocessing [optional]
|
92 |
-
|
93 |
-
[More Information Needed]
|
94 |
-
|
95 |
-
|
96 |
#### Training Hyperparameters
|
97 |
|
98 |
-
- **Training regime:**
|
99 |
-
|
100 |
-
#### Speeds, Sizes, Times [optional]
|
101 |
-
|
102 |
-
<!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
|
103 |
-
|
104 |
-
[More Information Needed]
|
105 |
|
106 |
## Evaluation
|
107 |
|
108 |
-
<!-- This section describes the evaluation protocols and provides the results. -->
|
109 |
-
|
110 |
### Testing Data, Factors & Metrics
|
111 |
|
112 |
#### Testing Data
|
113 |
|
114 |
-
|
115 |
-
|
116 |
-
[More Information Needed]
|
117 |
-
|
118 |
-
#### Factors
|
119 |
-
|
120 |
-
<!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
|
121 |
-
|
122 |
-
[More Information Needed]
|
123 |
|
124 |
#### Metrics
|
125 |
|
126 |
-
<!-- These are the evaluation metrics being used, ideally with a description of why. -->
|
127 |
-
|
128 |
-
[More Information Needed]
|
129 |
|
130 |
### Results
|
131 |
|
132 |
-
|
133 |
-
|
134 |
-
#### Summary
|
135 |
|
136 |
|
137 |
-
|
138 |
-
## Model Examination [optional]
|
139 |
-
|
140 |
-
<!-- Relevant interpretability work for the model goes here -->
|
141 |
-
|
142 |
-
[More Information Needed]
|
143 |
-
|
144 |
-
## Environmental Impact
|
145 |
-
|
146 |
-
<!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
|
147 |
-
|
148 |
-
Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
|
149 |
-
|
150 |
-
- **Hardware Type:** [More Information Needed]
|
151 |
-
- **Hours used:** [More Information Needed]
|
152 |
-
- **Cloud Provider:** [More Information Needed]
|
153 |
-
- **Compute Region:** [More Information Needed]
|
154 |
-
- **Carbon Emitted:** [More Information Needed]
|
155 |
-
|
156 |
-
## Technical Specifications [optional]
|
157 |
|
158 |
### Model Architecture and Objective
|
159 |
|
160 |
-
|
161 |
-
|
162 |
-
### Compute Infrastructure
|
163 |
-
|
164 |
-
[More Information Needed]
|
165 |
-
|
166 |
-
#### Hardware
|
167 |
-
|
168 |
-
[More Information Needed]
|
169 |
-
|
170 |
-
#### Software
|
171 |
|
172 |
-
|
173 |
-
|
174 |
-
## Citation [optional]
|
175 |
-
|
176 |
-
<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
|
177 |
|
178 |
**BibTeX:**
|
179 |
|
180 |
-
[More Information Needed]
|
181 |
-
|
182 |
-
**APA:**
|
183 |
-
|
184 |
-
[More Information Needed]
|
185 |
-
|
186 |
-
## Glossary [optional]
|
187 |
-
|
188 |
-
<!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
|
189 |
-
|
190 |
-
[More Information Needed]
|
191 |
-
|
192 |
-
## More Information [optional]
|
193 |
-
|
194 |
-
[More Information Needed]
|
195 |
-
|
196 |
-
## Model Card Authors [optional]
|
197 |
-
|
198 |
-
[More Information Needed]
|
199 |
-
|
200 |
-
## Model Card Contact
|
201 |
-
|
202 |
-
[More Information Needed]
|
|
|
6 |
- sft
|
7 |
---
|
8 |
|
9 |
+
# Model Card for CBTLlama: Fine Tuning LLaMA for CBT Thought Distortions
|
|
|
|
|
|
|
|
|
10 |
|
11 |
## Model Details
|
12 |
|
13 |
### Model Description
|
14 |
|
15 |
+
Developed by David Schiff, this Hugging Face transformers model, dubbed CBTLlama, is fine-tuned from LLaMA-3 8B.
|
16 |
+
It is specifically tailored to enhance Cognitive Behavioral Therapy (CBT) by detecting thought distortions and proposing possible challenges to them.
|
17 |
+
The model uses demographic and emotional state inputs to produce CBT scenarios, aiming to make CBT more accessible and effective.
|
18 |
+
This model is not intended to be used without professional assistance!
|
19 |
|
20 |
+
## Disclaimer
|
21 |
|
22 |
+
### Limitation of Liability
|
|
|
|
|
|
|
|
|
|
|
|
|
23 |
|
24 |
+
The developer of CBTLlama ("the model") provides this model on an "AS IS" basis and makes no warranties regarding its performance, accuracy, reliability, or suitability for any particular task or to achieve any specific results. The developer expressly disclaims any warranties of fitness for a particular purpose or non-infringement. In no event shall the developer be liable for any direct, indirect, incidental, special, exemplary, or consequential damages (including, but not limited to, procurement of substitute goods or services; loss of use, data, or profits; or business interruption) however caused and on any theory of liability, whether in contract, strict liability, or tort (including negligence or otherwise) arising in any way out of the use of this model, even if advised of the possibility of such damage.
|
25 |
|
26 |
+
This model is not intended to be a substitute for professional advice, diagnosis, or treatment. Users should always seek the advice of qualified health providers with any questions regarding their mental health or medical conditions. The developer assumes no responsibility for errors or omissions in the contents of the model or the consequences of its use.
|
27 |
|
|
|
|
|
|
|
28 |
|
29 |
+
- **Developed by:** David Schiff
|
30 |
+
- **Model type:** Fine-tuned LLaMA-3 8B
|
31 |
+
- **Language(s) (NLP):** English
|
32 |
+
- **License:** MIT
|
33 |
+
- **Finetuned from model:** LLaMA-3 8B
|
34 |
|
35 |
+
### Model Sources
|
36 |
|
37 |
+
- **Repository:** (URL to GitHub or similar)
|
38 |
+
- **Paper [optional]:** (Link to any published research or documentation)
|
39 |
+
- **Demo [optional]:** (Link to a model demonstration or interactive API)
|
40 |
|
41 |
+
## Uses
|
42 |
|
43 |
+
### Direct Use
|
44 |
|
45 |
+
CBTLlama is intended to be used directly by mental health practitioners to train their patients in identifying cognitive distortions
|
46 |
+
and challenging them.
|
47 |
|
48 |
+
### Downstream Use
|
49 |
|
50 |
+
While primarily designed for CBT, this model could be extended to other forms of therapy that require scenario generation or tailored mental health interventions.
|
51 |
|
52 |
### Out-of-Scope Use
|
53 |
|
54 |
+
This model is not intended to replace therapists or make clinical decisions. It should not be used as the sole method for diagnosing or treating mental health conditions.
|
|
|
|
|
55 |
|
56 |
## Bias, Risks, and Limitations
|
57 |
|
58 |
+
The model might exhibit biases based on the demographic data it was trained on. Users should critically assess the scenarios it generates,
|
59 |
+
especially when using the model with diverse populations.
|
|
|
60 |
|
61 |
### Recommendations
|
62 |
|
63 |
+
It is recommended that all outputs be reviewed by qualified professionals to ensure they are appropriate and sensitive to individual circumstances.
|
|
|
|
|
64 |
|
65 |
## How to Get Started with the Model
|
66 |
|
67 |
+
To start using CBTLlama, you can access the model via the Hugging Face API or download it directly from the repository.
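
As a rough illustration of the download route, the sketch below loads a checkpoint with the `transformers` library and generates a response. The repository ID `your-username/CBTLlama`, the helper names `build_prompt` and `generate`, and the prompt layout are all hypothetical placeholders, not the format the model was trained on; adjust them to match the actual repository and prompt template.

```python
# Hypothetical placeholder: replace with the actual Hugging Face repository ID.
MODEL_ID = "your-username/CBTLlama"


def build_prompt(age: int, occupation: str, emotion: str, situation: str) -> str:
    """Format demographic and emotional-state inputs as plain text.

    The field names and layout are assumptions made for illustration only.
    """
    return (
        f"Age: {age}\n"
        f"Occupation: {occupation}\n"
        f"Emotion: {emotion}\n"
        f"Situation: {situation}\n"
        "Identify the thought distortion and suggest a challenge:"
    )


def generate(prompt: str, model_id: str = MODEL_ID, max_new_tokens: int = 256) -> str:
    """Load the checkpoint and return only the newly generated text."""
    # Imported lazily so build_prompt works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Slice off the prompt tokens so the echoed input is not returned.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)


# Example call (downloads the weights, so it is left commented out):
# print(generate(build_prompt(34, "teacher", "anxious", "missed a deadline")))
```

Any output produced this way should still be reviewed by a qualified professional before it is shown to a patient.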
|
|
|
|
|
68 |
|
69 |
## Training Details
|
70 |
|
71 |
### Training Data
|
72 |
|
73 |
+
The training data consisted of simulated CBT scenarios generated by Claude from diverse demographic profiles and emotional states, intended to cover a broad range of potential therapy situations.
|
|
|
|
|
74 |
|
75 |
### Training Procedure
|
76 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
77 |
#### Training Hyperparameters
|
78 |
|
79 |
+
- **Training regime:** Mixed precision training for efficiency
|
|
|
|
|
|
|
|
|
|
|
|
|
80 |
|
81 |
## Evaluation
|
82 |
|
|
|
|
|
83 |
### Testing Data, Factors & Metrics
|
84 |
|
85 |
#### Testing Data
|
86 |
|
87 |
+
The testing involved real-world CBT session scenarios evaluated by mental health professionals to validate the realism and utility of the model-generated content.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
88 |
|
89 |
#### Metrics
|
90 |
|
|
|
|
|
|
|
91 |
|
92 |
### Results
|
93 |
|
94 |
+
Results indicated that CBTLlama produces highly accurate detections and challenges of thought distortions.
|
|
|
|
|
95 |
|
96 |
|
97 |
+
## Technical Specifications
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
98 |
|
99 |
### Model Architecture and Objective
|
100 |
|
101 |
+
The model utilizes the LLaMA-3 architecture with modifications to specifically suit CBT scenario generation.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
102 |
|
103 |
+
## Citation
|
|
|
|
|
|
|
|
|
104 |
|
105 |
**BibTeX:**
|
106 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|