abagherp committed on
Commit 6830eb0 · verified · 1 Parent(s): 506bc5c

Upload folder using huggingface_hub
.github/workflows/update_space.yml ADDED
@@ -0,0 +1,28 @@
+ name: Run Python script
+
+ on:
+   push:
+     branches:
+       - main
+
+ jobs:
+   build:
+     runs-on: ubuntu-latest
+
+     steps:
+     - name: Checkout
+       uses: actions/checkout@v2
+
+     - name: Set up Python
+       uses: actions/setup-python@v2
+       with:
+         python-version: '3.9'
+
+     - name: Install Gradio
+       run: python -m pip install gradio
+
+     - name: Log in to Hugging Face
+       run: python -c 'import huggingface_hub; huggingface_hub.login(token="${{ secrets.hf_token }}")'
+
+     - name: Deploy to Spaces
+       run: gradio deploy
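The login step above interpolates the secret directly into a `python -c` command line, which breaks if the token ever contains a shell-significant character. A hypothetical helper (not part of this commit) that reads the token from an environment variable sidesteps the quoting issue; `hf_login_from_env` and the `HF_TOKEN` variable name are assumptions for illustration:

```python
import os

def hf_login_from_env(login_fn, var: str = "HF_TOKEN") -> None:
    """Read the Hugging Face token from the environment and hand it to the
    given login callable (e.g. huggingface_hub.login), instead of
    interpolating the secret into a shell command line."""
    token = os.environ.get(var)
    if not token:
        raise RuntimeError(f"{var} is not set")
    login_fn(token=token)
```

In the workflow this would be invoked with `env: HF_TOKEN: ${{ secrets.hf_token }}` on the step, so the secret never appears in the command string.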
.gitignore ADDED
@@ -0,0 +1,61 @@
+ instruction.md
+ .env
+ *.pyc
+ .langchain.db
+ *.db
+ .gradio/*
+ .venv/*
+ __pycache__/
+ *.py[cod]
+ *$py.class
+ *.so
+ .Python
+ env/
+ build/
+ develop-eggs/
+ dist/
+ downloads/
+ eggs/
+ .eggs/
+ lib/
+ lib64/
+ parts/
+ sdist/
+ var/
+ wheels/
+ *.egg-info/
+ .installed.cfg
+ *.egg
+
+ # Virtual Environment
+ venv/
+ ENV/
+
+ # IDE
+ .idea/
+ .vscode/
+ *.swp
+ *.swo
+
+ # Gradio
+ .gradio/
+ flagged/
+
+ # Project specific
+ .env
+ .langchain.db
+ cache/
+ cache/*.db
+ config/credentials.yaml
+
+ # Data files
+ data/*.mp3
+ data/*.wav
+ data/*.aac
+ data/*.ogg
+ data/*.flac
+ # Include our sample audio file (gitignore does not support trailing comments)
+ !data/CBT Role-Play.mp3
+
+ # Logs
+ *.log
README.md CHANGED
@@ -1,12 +1,100 @@
  ---
  title: TherapyNote
- emoji: 📊
- colorFrom: pink
- colorTo: indigo
+ app_file: app.py
  sdk: gradio
  sdk_version: 5.9.0
- app_file: app.py
- pinned: false
+ organization: pxpab
  ---
-
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+ # Therapy Session Analysis Pipeline
+
+ A Python project that downloads YouTube therapy session captions and extracts structured information using LLMs, LangChain, and LangGraph.
+
+ ## Features
+
+ - Downloads captions from YouTube therapy sessions
+ - Extracts structured information using LLMs and LangChain
+ - Supports multiple note formats (SOAP, DAP, BIRP, etc.)
+ - Uses LangGraph for data extraction workflows
+ - Manages prompts in a dedicated "langhub" directory
+ - Integrates with LangSmith for conversation and run logging
+
+ ## Prerequisites
+
+ - Python 3.9+
+ - uv package manager
+ - OpenAI API key
+ - LangChain API key (for logging)
+
+ ## Installation
+
+ 1. Clone the repository:
+ ```bash
+ git clone https://github.com/yourusername/therapy-session-analysis.git
+ cd therapy-session-analysis
+ ```
+
+ 2. Install dependencies using uv:
+ ```bash
+ uv pip install -r requirements.txt
+ ```
+
+ 3. Set up environment variables:
+ ```bash
+ export OPENAI_API_KEY="your-openai-key"
+ export LANGCHAIN_API_KEY="your-langchain-key"
+ export LANGCHAIN_TRACING_V2="true"
+ ```
+
+ ## Project Structure
+
+ ```
+ project/
+ ├── config/
+ │   ├── __init__.py
+ │   └── settings.py
+ ├── langhub/
+ │   ├── __init__.py
+ │   └── prompts/
+ │       ├── __init__.py
+ │       └── therapy_extraction_prompt.yaml
+ ├── forms/
+ │   ├── __init__.py
+ │   └── schemas.py
+ ├── utils/
+ │   ├── __init__.py
+ │   ├── youtube.py
+ │   └── text_processing.py
+ ├── models/
+ │   ├── __init__.py
+ │   └── llm_provider.py
+ ├── main.py
+ ├── requirements.txt
+ └── README.md
+ ```
+
+ ## Usage
+
+ Run the main script:
+ ```bash
+ python main.py
+ ```
+
+ ## Note Formats
+
+ The system supports multiple therapy note formats:
+ - SOAP (Subjective, Objective, Assessment, Plan)
+ - DAP (Data, Assessment, Plan)
+ - BIRP (Behavior, Intervention, Response, Plan)
+ - And more...
+
+ ## Contributing
+
+ 1. Fork the repository
+ 2. Create your feature branch (`git checkout -b feature/amazing-feature`)
+ 3. Commit your changes (`git commit -m 'Add some amazing feature'`)
+ 4. Push to the branch (`git push origin feature/amazing-feature`)
+ 5. Open a Pull Request
+
+ ## License
+
+ This project is licensed under the MIT License - see the LICENSE file for details.
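The commit imports `extract_youtube_video_id` from `utils/youtube.py`, but that module is not included here. A minimal sketch of what such a helper might look like (the regex and error handling are assumptions, not the repo's actual implementation):

```python
import re

# Matches standard watch URLs (youtube.com/watch?v=ID, possibly with other
# query parameters before v=) and youtu.be short links. YouTube video ids
# are 11 characters drawn from [A-Za-z0-9_-].
_YT_ID = re.compile(
    r"(?:youtube\.com/watch\?(?:[^&\s]*&)*v=|youtu\.be/)([A-Za-z0-9_-]{11})"
)

def extract_youtube_video_id(url: str) -> str:
    """Return the 11-character video id embedded in a YouTube URL."""
    match = _YT_ID.search(url)
    if not match:
        raise ValueError(f"Could not find a video id in: {url}")
    return match.group(1)
```

The two example URLs wired into the Gradio `Examples` block in `app.py` both resolve correctly under this sketch.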
app.py ADDED
@@ -0,0 +1,352 @@
+ from __future__ import annotations
+ import os
+ from pathlib import Path
+ import yaml
+ import gradio as gr
+ from typing import Optional
+
+ from langchain_core.prompts import ChatPromptTemplate
+ from langchain_core.messages import HumanMessage, SystemMessage
+
+ from config.settings import settings
+ from forms.schemas import (
+     SOAPNote, DAPNote, BIRPNote, PIRPNote, GIRPNote, SIRPNote,
+     FAIRFDARPNote, DARENote, PIENote, SOAPIERNote, SOAPIENote,
+     POMRNote, NarrativeNote, CBENote, SBARNote
+ )
+ from utils.youtube import download_transcript
+ from utils.youtube import extract_youtube_video_id
+ from utils.text_processing import chunk_text
+ from utils.audio import transcribe_audio
+ from models.llm_provider import get_llm, get_model_identifier
+ from utils.cache import CacheManager
+ from config.auth import load_auth_credentials
+
+ # Dictionary mapping form types to their schemas
+ FORM_SCHEMAS = {
+     "SOAP": SOAPNote,
+     "DAP": DAPNote,
+     "BIRP": BIRPNote,
+     "PIRP": PIRPNote,
+     "GIRP": GIRPNote,
+     "SIRP": SIRPNote,
+     "FAIR/F-DARP": FAIRFDARPNote,
+     "DARE": DARENote,
+     "PIE": PIENote,
+     "SOAPIER": SOAPIERNote,
+     "SOAPIE": SOAPIENote,
+     "POMR": POMRNote,
+     "Narrative": NarrativeNote,
+     "CBE": CBENote,
+     "SBAR": SBARNote,
+ }
+
+ # Initialize cache manager
+ cache_manager = CacheManager()
+
+ def load_prompt(note_type: str) -> tuple[str, str]:
+     """Load the prompt template from YAML for the specified note type."""
+     prompt_path = Path("langhub/prompts/therapy_extraction_prompt.yaml")
+     with open(prompt_path, "r") as f:
+         data = yaml.safe_load(f)
+
+     note_prompts = data.get("prompts", {}).get(note_type.lower())
+     if not note_prompts:
+         raise ValueError(f"No prompt template found for note type: {note_type}")
+
+     return note_prompts["system"], note_prompts["human"]
+
+ def process_input(
+     input_text: str,
+     form_type: str,
+     input_type: str = "text",
+     audio_file: str | None = None,
+     force_refresh: bool = False
+ ) -> str:
+     """Process input (text, YouTube URL, or audio) and generate notes."""
+     try:
+         # Get transcript based on input type
+         if input_type == "audio" and audio_file:
+             print("Processing audio file...")
+             transcript = transcribe_audio(audio_file)
+         elif "youtube.com" in input_text or "youtu.be" in input_text:
+             print("Downloading transcript from YouTube...")
+             video_id = extract_youtube_video_id(input_text)
+
+             # Check cache first
+             if not force_refresh:
+                 cached_transcript = cache_manager.get_transcript(video_id)
+                 if cached_transcript:
+                     print("Using cached transcript...")
+                     transcript = cached_transcript
+                 else:
+                     transcript = download_transcript(input_text)
+                     cache_manager.store_transcript(video_id, transcript)
+             else:
+                 transcript = download_transcript(input_text)
+                 cache_manager.store_transcript(video_id, transcript)
+         else:
+             print("Using provided text directly...")
+             transcript = input_text
+
+         # Initialize LLM
+         llm = get_llm()
+         model_id = get_model_identifier(llm)
+
+         # Check extraction cache
+         if not force_refresh:
+             cached_result = cache_manager.get_extraction(
+                 transcript,
+                 form_type.lower(),
+                 model_id
+             )
+             if cached_result:
+                 print("Using cached extraction result...")
+                 formatted_response = yaml.dump(
+                     cached_result,
+                     default_flow_style=False,
+                     sort_keys=False
+                 )
+                 return f"## {form_type} Note:\n```yaml\n{formatted_response}\n```"
+
+         # Get schema for selected form type
+         schema = FORM_SCHEMAS.get(form_type)
+         if not schema:
+             return f"Error: Unsupported form type {form_type}"
+
+         # Create structured LLM
+         structured_llm = llm.with_structured_output(schema=schema)
+
+         # Load prompts
+         system_prompt, human_prompt = load_prompt(form_type.lower())
+
+         # Create prompt template
+         prompt = ChatPromptTemplate.from_messages([
+             ("system", system_prompt),
+             ("human", human_prompt)
+         ])
+
+         # Process transcript
+         print(f"Generating {form_type} note...")
+         response = structured_llm.invoke(transcript)
+
+         # Store result in cache
+         result_dict = response.model_dump(exclude_unset=False, exclude_none=False)
+         cache_manager.store_extraction(
+             transcript,
+             form_type.lower(),
+             result_dict,
+             model_id
+         )
+
+         # Format the response
+         formatted_response = yaml.dump(
+             result_dict,
+             default_flow_style=False,
+             sort_keys=False
+         )
+
+         return f"## {form_type} Note:\n```yaml\n{formatted_response}\n```"
+
+     except Exception as e:
+         return f"Error: {str(e)}"
+
+ def create_ui() -> gr.Blocks:
+     """Create the Gradio interface."""
+
+     # Load authorized users from config
+     auth = load_auth_credentials()
+
+     def check_auth(username: str, password: str) -> bool:
+         """Check if username and password are valid."""
+         return username in auth and auth[username] == password
+
+     with gr.Blocks(title="Therapy Note Generator") as demo:
+         # Login interface
+         with gr.Row():
+             with gr.Column():
+                 username = gr.Textbox(label="Username")
+                 password = gr.Textbox(label="Password", type="password")
+                 login_btn = gr.Button("Login")
+                 login_msg = gr.Markdown()
+
+         # Main interface (initially invisible)
+         with gr.Column(visible=False) as main_interface:
+             gr.Markdown("# Therapy Note Generator")
+             gr.Markdown("""
+             Enter a YouTube URL, paste a transcript directly, or upload an audio file.
+             Select the desired note format and click 'Generate' to create a structured note.
+             """)
+
+             with gr.Row():
+                 with gr.Column():
+                     # Input type selector
+                     input_type = gr.Radio(
+                         choices=["text", "youtube", "audio"],
+                         value="text",
+                         label="Input Type",
+                         info="Choose how you want to provide the therapy session"
+                     )
+
+                     # Text input for transcript or YouTube URL
+                     input_text = gr.Textbox(
+                         label="Text Input",
+                         placeholder="Enter transcript or YouTube URL here...",
+                         lines=10,
+                         visible=True
+                     )
+
+                     # Audio upload
+                     audio_input = gr.Audio(
+                         label="Audio Input",
+                         type="filepath",
+                         visible=False
+                     )
+
+                     # Note format selector
+                     form_type = gr.Dropdown(
+                         choices=list(FORM_SCHEMAS.keys()),
+                         value="SOAP",
+                         label="Note Format"
+                     )
+
+                     generate_btn = gr.Button("Generate Note", variant="primary")
+
+                 with gr.Column():
+                     # Transcript output
+                     transcript_output = gr.Textbox(
+                         label="Generated Transcript",
+                         lines=10,
+                         visible=False,
+                         interactive=False
+                     )
+                     # Structured note output
+                     note_output = gr.Markdown(label="Generated Note")
+
+             # Update visibility based on input type
+             def update_inputs(choice):
+                 return {
+                     input_text: gr.update(visible=choice in ["text", "youtube"]),
+                     audio_input: gr.update(visible=choice == "audio"),
+                     transcript_output: gr.update(visible=choice in ["youtube", "audio"])
+                 }
+
+             input_type.change(
+                 fn=update_inputs,
+                 inputs=input_type,
+                 outputs=[input_text, audio_input, transcript_output]
+             )
+
+             def process_and_show_transcript(
+                 input_text: str,
+                 form_type: str,
+                 input_type: str = "text",
+                 audio_file: str | None = None,
+                 force_refresh: bool = False
+             ) -> tuple[str, str]:
+                 """Process input and return both transcript and structured note."""
+                 try:
+                     # Get transcript based on input type
+                     if input_type == "audio" and audio_file:
+                         print("Processing audio file...")
+                         transcript = transcribe_audio(audio_file)
+                     elif "youtube.com" in input_text or "youtu.be" in input_text:
+                         print("Downloading transcript from YouTube...")
+                         video_id = extract_youtube_video_id(input_text)
+
+                         # Check cache first
+                         if not force_refresh:
+                             cached_transcript = cache_manager.get_transcript(video_id)
+                             if cached_transcript:
+                                 print("Using cached transcript...")
+                                 transcript = cached_transcript
+                             else:
+                                 transcript = download_transcript(input_text)
+                                 cache_manager.store_transcript(video_id, transcript)
+                         else:
+                             transcript = download_transcript(input_text)
+                             cache_manager.store_transcript(video_id, transcript)
+                     else:
+                         print("Using provided text directly...")
+                         transcript = input_text
+
+                     # Process the transcript to generate the note
+                     note_output = process_input(input_text, form_type, input_type, audio_file, force_refresh)
+
+                     return transcript, note_output
+
+                 except Exception as e:
+                     error_msg = f"Error: {str(e)}"
+                     return error_msg, error_msg
+
+             # Handle generate button click
+             generate_btn.click(
+                 fn=process_and_show_transcript,
+                 inputs=[input_text, form_type, input_type, audio_input],
+                 outputs=[transcript_output, note_output]
+             )
+
+             # Example inputs
+             try:
+                 with open("data/sample_note.txt", "r") as f:
+                     sample_text = f.read()
+             except FileNotFoundError:
+                 sample_text = "Sample therapy session transcript..."
+
+             gr.Examples(
+                 examples=[
+                     # Text example
+                     [sample_text, "SOAP", "text", None],
+                     # YouTube examples
+                     ["https://www.youtube.com/watch?v=KuHLL2AE-SE", "DAP", "youtube", None],
+                     ["https://www.youtube.com/watch?v=jS1KE3_Pqlc", "SOAPIER", "youtube", None],
+                     # Audio example
+                     [None, "BIRP", "audio", "data/CBT Role-Play.mp3"]
+                 ],
+                 inputs=[input_text, form_type, input_type, audio_input],
+                 outputs=[transcript_output, note_output],
+                 fn=process_and_show_transcript,
+                 cache_examples=False,
+                 label="Example Inputs",
+                 examples_per_page=4
+             )
+
+         def login(username: str, password: str):
+             """Handle login and return updates for UI components."""
+             if check_auth(username, password):
+                 return [
+                     gr.update(visible=True),  # main_interface
+                     gr.update(value="✅ Login successful!", visible=True),  # login_msg
+                     gr.update(visible=False),  # username
+                     gr.update(visible=False),  # password
+                     gr.update(visible=False),  # login_btn
+                 ]
+             else:
+                 return [
+                     gr.update(visible=False),  # main_interface
+                     gr.update(value="❌ Invalid credentials", visible=True),  # login_msg
+                     gr.update(),  # username - no change
+                     gr.update(),  # password - no change
+                     gr.update(),  # login_btn - no change
+                 ]
+
+         login_btn.click(
+             fn=login,
+             inputs=[username, password],
+             outputs=[main_interface, login_msg, username, password, login_btn]
+         )
+
+     return demo
+
+ if __name__ == "__main__":
+     # Clean up any existing Gradio cache
+     cache_manager.cleanup_gradio_cache()
+
+     demo = create_ui()
+     demo.launch(
+         server_name="0.0.0.0",
+         server_port=7860,
+         share=True,
+         show_error=True,
+         auth=None  # We're using our own auth system instead of Gradio's
+     )
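`app.py` keys its extraction cache on the transcript, the lowercased form type, and the model identifier, but `utils/cache.py` is not part of this commit. A hypothetical sketch of how such a composite key could be derived (the function name and key layout are assumptions, not the repo's actual `CacheManager` internals):

```python
import hashlib

def extraction_cache_key(transcript: str, form_type: str, model_id: str) -> str:
    """Hypothetical cache-key scheme: hash the full transcript so long
    inputs stay compact, and include the form type and model id so a
    cached SOAP note is never returned for a DAP request or a different
    model."""
    digest = hashlib.sha256(transcript.encode("utf-8")).hexdigest()
    return f"{form_type}:{model_id}:{digest}"
```

Hashing the transcript also means the cache lookup cost does not grow with transcript length beyond the one-time digest computation.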
config/__init__.py ADDED
@@ -0,0 +1 @@
+ # Package initialization
config/auth.py ADDED
@@ -0,0 +1,25 @@
+ from __future__ import annotations
+ import os
+ from pathlib import Path
+ import yaml
+ from typing import Dict
+
+ def load_auth_credentials() -> Dict[str, str]:
+     """Load authentication credentials from YAML file."""
+     auth_file = Path("config/credentials.yaml")
+
+     if not auth_file.exists():
+         # Create default credentials file if it doesn't exist
+         default_auth = {
+             "credentials": {
+                 "admin": os.environ.get("ADMIN_PASSWORD", "change_this_password"),
+             }
+         }
+         auth_file.parent.mkdir(parents=True, exist_ok=True)
+         with open(auth_file, "w") as f:
+             yaml.dump(default_auth, f)
+
+     with open(auth_file, "r") as f:
+         auth_data = yaml.safe_load(f)
+
+     return auth_data.get("credentials", {})
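The `check_auth` helper in `app.py` compares the stored and submitted passwords with `==`, which can leak information through timing. A sketch of a variant using the standard library's `hmac.compare_digest` (a constant-time comparison; this is a suggested alternative, not code from this commit):

```python
import hmac

def check_auth(credentials: dict, username: str, password: str) -> bool:
    """Variant of app.py's check_auth that compares passwords with
    hmac.compare_digest, so the comparison time does not depend on how
    many leading characters of the guess are correct."""
    stored = credentials.get(username)
    return stored is not None and hmac.compare_digest(stored, password)
```

Storing hashed rather than plaintext passwords in `credentials.yaml` would be the larger improvement; `compare_digest` would then be applied to the hashes.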
config/settings.py ADDED
@@ -0,0 +1,20 @@
+ from __future__ import annotations
+ import os
+ from dotenv import load_dotenv
+ # Load environment variables from .env file
+ load_dotenv()
+
+ class Settings:
+     OPENAI_API_KEY = os.environ.get("OPENAI_API_KEY", "")
+     LANGCHAIN_API_KEY = os.environ.get("LANGCHAIN_API_KEY", "")
+     GOOGLE_API_KEY = os.environ.get("GOOGLE_API_KEY", "")
+     DEEPGRAM_API_KEY = os.environ.get("DEEPGRAM_API_KEY", "")
+
+     # Provider can be "openai" or "google_gemini"
+     MODEL_PROVIDER = os.environ.get("MODEL_PROVIDER", "openai")
+
+     # Default model names
+     OPENAI_MODEL_NAME = "gpt-4o-mini"
+     GEMINI_MODEL_NAME = "gemini-2.0-flash-exp"
+
+ settings = Settings()
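`models/llm_provider.py`, which consumes these settings via `get_llm()`, is not included in this commit. A hypothetical sketch of how it might map the `MODEL_PROVIDER` setting to the default model names defined above (the function and table names are assumptions for illustration):

```python
# Hypothetical dispatch table mirroring the defaults in config/settings.py.
DEFAULT_MODELS = {
    "openai": "gpt-4o-mini",
    "google_gemini": "gemini-2.0-flash-exp",
}

def default_model_name(provider: str) -> str:
    """Return the default model for a provider, failing loudly on typos
    rather than silently falling back to one provider."""
    try:
        return DEFAULT_MODELS[provider]
    except KeyError:
        raise ValueError(f"Unknown MODEL_PROVIDER: {provider!r}") from None
```

The real `get_llm()` would then construct the matching LangChain chat model (e.g. an OpenAI or Gemini client) from this name and the corresponding API key.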
data/sample_note.txt ADDED
@@ -0,0 +1,61 @@
+ Client: Jane Doe, a 38-year-old female client who has been attending weekly therapy sessions for generalized anxiety and relationship stress.
+ Therapist: Dr. Smith, a licensed clinical psychologist
+ Date/Time: October 10, 2023, 2:00 PM - 2:45 PM
+ Modality: Video telehealth session (client in private home office, therapist in private office)
+
+ Presenting Problem:
+ Jane has been experiencing increased anxiety, irritability, and difficulty concentrating on personal and professional tasks. She is concerned about her ability to manage stress related to upcoming changes at work and tension in her relationship with her partner.
+
+ Sample Transcript & Observations:
+
+ 2:00 PM - 2:05 PM (Check-in & Rapport Building)
+ • Therapist: “Hi Jane, it’s good to see you again. How have you been since our last session?”
+ • Client (Jane): “I’ve felt pretty overwhelmed this past week. My workload at the office nearly doubled, and my partner and I had a few arguments about household responsibilities.”
+
+ Therapist’s Notes (Not said aloud): Jane appears somewhat tired; she’s rubbing her temples and has a tense expression. She’s making consistent eye contact but is fidgeting with a pen.
+
+ 2:05 PM - 2:15 PM (Exploration of Symptoms)
+ • Therapist: “You mentioned feeling overwhelmed. Can you tell me more about what’s been making you feel that way?”
+ • Client: “At work, I’m worried I can’t keep up. My manager just assigned three new projects. I’m not sleeping well because I’m anxious about meeting deadlines. I wake up around 4:00 AM every day, heart racing.”
+ • Therapist: “You mentioned last session that you were using some deep breathing techniques. How has that been going?”
+ • Client: “I tried once this week, but I felt too restless. I ended up just scrolling through my phone instead, which probably made it worse.”
+
+ Therapist’s Notes: Jane reports continued anxiety, difficulty sleeping (waking early and feeling restless), and shows signs of muscle tension (clenched jaw, rubbing neck).
+
+ 2:15 PM - 2:25 PM (Discussing Relationship Stress)
+ • Therapist: “You said you had a few arguments with your partner. Can you share what led to those conflicts?”
+ • Client: “We’ve been arguing about chores and who’s responsible for what. I feel like I’m doing most of the housework. My partner says I’m too critical and not asking for help directly. I guess I’m not communicating well.”
+ • Therapist: “Have you tried implementing any of the communication strategies we discussed last time, like using ‘I’ statements or scheduling a set time to talk about chores?”
+ • Client: “I tried once, but it felt forced. I ended up just complaining about how stressed I am. I know that didn’t help.”
+
+ Therapist’s Notes: Jane acknowledges difficulty implementing previously discussed communication strategies and expresses guilt and frustration about these interactions.
+
+ 2:25 PM - 2:35 PM (Coping Strategies & Goals)
+ • Therapist: “It sounds like the stress at work and home is contributing to your anxiety. Let’s revisit some coping techniques. We talked about structured problem-solving and brief relaxation exercises. Are there any moments during the day you could schedule a short break to practice deep breathing or a quick mindfulness exercise?”
+ • Client: “I think I could try taking a five-minute break mid-morning. Maybe stepping away from my desk and doing some guided breathing could help.”
+ • Therapist: “Great. Let’s also consider a small goal for communication at home. Perhaps one evening this week, you could let your partner know when you’re feeling overwhelmed before it escalates. You could say, ‘I’m feeling anxious about work and need a few minutes to gather my thoughts.’”
+ • Client: “I can try that. I don’t want to keep arguing. I want to feel more in control of these situations.”
+
+ Therapist’s Notes: Jane is willing to identify a concrete step: one structured break during work and one proactive communication attempt at home. She appears motivated yet still uncertain.
+
+ 2:35 PM - 2:40 PM (Review of Mood & Safety)
+ • Therapist: “On a scale of 0-10, how would you rate your anxiety right now?”
+ • Client: “Maybe a 6. It was about an 8 earlier in the week.”
+ • Therapist: “Any thoughts of self-harm or harm to others since we last spoke?”
+ • Client: “No, I’ve had no suicidal thoughts. It’s just stress and worry, not that kind of feeling.”
+ • Therapist: “Okay, that’s good to know. Are you still taking your medication as prescribed by your psychiatrist?”
+ • Client: “Yes, I’ve been consistent with my SSRI. I think it helps a bit.”
+
+ Therapist’s Notes: Jane denies self-harm ideation. Anxiety rating is moderately high but lower than peak for the week.
+
+ 2:40 PM - 2:45 PM (Session Wrap-Up & Next Steps)
+ • Therapist: “In our next session, we can check in on how the mid-morning breaks and the proactive communication attempt went. Let’s also consider practicing a brief relaxation exercise together next time.”
+ • Client: “That sounds good. I’ll try to be more consistent with those breaks and let you know how it goes.”
+ • Therapist: “Great. See you next week at the same time.”
+ • Client: “Thank you, see you then.”
+
+ Supplementary Data/Measures:
+ • PHQ-9: Administered at intake, last score was 9 (mild-moderate depression symptoms). Not administered this session but client reports stable mood with primarily anxiety-driven symptoms.
+ • GAD-7: Last recorded score was 12, suggesting moderate anxiety. Client’s subjective rating today is a 6/10 at session’s end.
+
+ No presence of family members noted this session. Client was alone in a private space.
forms/__init__.py ADDED
@@ -0,0 +1 @@
+ # Package initialization
forms/schemas.py ADDED
@@ -0,0 +1,159 @@
+ from __future__ import annotations
+ from typing import Optional, List
+ from pydantic import BaseModel, Field
+
+ class SOAPNote(BaseModel):
+     """
+     A SOAP note is structured into four sections: Subjective, Objective, Assessment, and Plan.
+     These fields help track the client's self-reported experiences, observable data, the clinician's assessment, and the future treatment plan.
+     """
+     Subjective: Optional[str] = Field(None, description="Client's self-reported symptoms, emotions, concerns, and relevant personal history.")
+     Objective: Optional[str] = Field(None, description="Observable and measurable data, such as behavior, affect, test results, or vital signs.")
+     Assessment: Optional[str] = Field(None, description="Clinician's interpretation of the subjective and objective data, including diagnosis and progress.")
+     Plan: Optional[str] = Field(None, description="Outline of next steps, changes to treatment, referrals, and any planned interventions.")
+
+ class DAPNote(BaseModel):
+     """
+     A DAP note includes Data, Assessment, and Plan. It condenses subjective and objective info into a single 'Data' section.
+     """
+     Data: Optional[str] = Field(None, description="Combined subjective and objective information: client's statements, therapist observations, relevant tests.")
+     Assessment: Optional[str] = Field(None, description="Therapist's interpretation of the data, clinical impressions, and identified issues.")
+     Plan: Optional[str] = Field(None, description="Next steps, goals for future sessions, and recommended interventions or activities.")
+
+ class BIRPNote(BaseModel):
+     """
+     A BIRP note includes Behavior, Intervention, Response, and Plan, emphasizing the therapist's interventions and the client's reaction.
+     """
+     Behavior: Optional[str] = Field(None, description="Client's behavior during the session (verbal/non-verbal) and any observations made by the therapist.")
+     Intervention: Optional[str] = Field(None, description="Specific techniques, methods, or therapies used by the clinician during the session.")
+     Response: Optional[str] = Field(None, description="How the client responded to the interventions, including changes in affect, participation, or symptom relief.")
+     Plan: Optional[str] = Field(None, description="Follow-up steps, homework assignments, referrals, or next session focus.")
+
+ class PIRPNote(BaseModel):
+     """
+     A PIRP note is Problem, Intervention, Response, and Plan, focusing on a particular client problem.
+     """
+     Problem: Optional[str] = Field(None, description="The client's presenting problem, symptoms, or reason for seeking therapy.")
+     Intervention: Optional[str] = Field(None, description="Actions taken by the therapist to address the identified problem.")
+     Response: Optional[str] = Field(None, description="Client's reaction or changes after the intervention was applied.")
+     Plan: Optional[str] = Field(None, description="Next steps for addressing the problem, including future sessions, techniques, or referrals.")
+
+ class GIRPNote(BaseModel):
+     """
+     A GIRP note focuses on Goals, Intervention, Response, and Plan, centering around client-defined goals.
+     """
+     Goals: Optional[str] = Field(None, description="The client's short-term and long-term therapy goals or objectives.")
+     Intervention: Optional[str] = Field(None, description="Therapeutic interventions used to help the client work toward these goals.")
+     Response: Optional[str] = Field(None, description="How the client responded to the interventions and their progress toward goals.")
+     Plan: Optional[str] = Field(None, description="Plan for future sessions, homework, referrals, or adjustments to help achieve goals.")
+
+ class SIRPNote(BaseModel):
+     """
+     A SIRP note organizes notes by Situation, Intervention, Response, and Plan, emphasizing the client's current situation.
+     """
+     Situation: Optional[str] = Field(None, description="The client's presenting situation, including current symptoms, concerns, and background info.")
+     Intervention: Optional[str] = Field(None, description="Interventions, assessments, and recommendations made during the session.")
+     Response: Optional[str] = Field(None, description="Client's response to the intervention, observed changes or feedback.")
+     Plan: Optional[str] = Field(None, description="Next steps, follow-up appointments, referrals, and any planned adjustments.")
+
+ class FAIRFDARPNote(BaseModel):
+     """
+     A FAIR/F-DARP note includes Focus, Assessment, Intervention, Response (FAIR)
+     or Focus, Data, Action, Response, Plan (F-DARP).
+     Here we combine them: Focus, Data, Action, Response, (and optionally Plan).
+     """
+     Focus: Optional[str] = Field(None, description="Focus of the note, such as a nursing diagnosis, event, or primary concern.")
+     Data: Optional[str] = Field(None, description="Subjective and objective data about the client/patient condition.")
+     Action: Optional[str] = Field(None, description="Actions taken by the provider in response to the data (e.g., treatments, education).")
+     Response: Optional[str] = Field(None, description="Client's response to the actions taken.")
+     Plan: Optional[str] = Field(None, description="Future steps or follow-up if using the full F-DARP format.")
+
+ class DARENote(BaseModel):
+     """
+     A DARE note stands for Data, Action, Response, Education. Emphasizes client education and their response.
+     """
+     Data: Optional[str] = Field(None, description="Subjective and objective client information and therapist's observations.")
+     Action: Optional[str] = Field(None, description="Specific actions, treatments, or interventions the therapist took.")
+     Response: Optional[str] = Field(None, description="Client's response to those actions, improvements, or changes in symptoms.")
+     Education: Optional[str] = Field(None, description="Education provided to the client about their condition, treatments, or coping strategies.")
+
+ class PIENote(BaseModel):
+     """
+     A PIE note: Problem, Intervention, Evaluation. It's similar to PIRP but focuses on evaluating interventions.
+     """
+     Problem: Optional[str] = Field(None, description="Client's identified problem, whether mental health symptom or behavior issue.")
+     Intervention: Optional[str] = Field(None, description="What the therapist did to address the problem (techniques, strategies).")
+     Evaluation: Optional[str] = Field(None, description="How effective the intervention was, changes in the client, and next steps.")
+
+ class SOAPIERNote(BaseModel):
+     """
+     A SOAPIER note expands SOAP by adding Intervention, Evaluation, and Revision sections for more comprehensive documentation.
+     """
+     Subjective: Optional[str] = Field(None, description="Client's subjective complaints, feelings, statements.")
+     Objective: Optional[str] = Field(None, description="Observable, measurable data, test results, or observations.")
+     Assessment: Optional[str] = Field(None, description="Therapist's interpretation, diagnosis, or clinical judgment.")
+     Plan: Optional[str] = Field(None, description="Proposed interventions, follow-ups, or referrals.")
+     Intervention: Optional[str] = Field(None, description="Specific interventions implemented during the session.")
+     Evaluation: Optional[str] = Field(None, description="Client's response to interventions and progress made.")
+     Revision: Optional[str] = Field(None, description="Adjustments to the treatment plan based on evaluation.")
+
+ class SOAPIENote(BaseModel):
+     """
+     A SOAPIE note is similar to SOAPIER but only adds Intervention and Evaluation to the standard SOAP note.
+     """
+     Subjective: Optional[str] = Field(None, description="Client's self-reported experiences and symptoms.")
+     Objective: Optional[str] = Field(None, description="Observable data and measurable findings.")
+     Assessment: Optional[str] = Field(None, description="Clinician's interpretation and clinical impressions.")
+     Plan: Optional[str] = Field(None, description="Planned interventions, referrals, or changes.")
+     Intervention: Optional[str] = Field(None, description="Interventions used during the session.")
+     Evaluation: Optional[str] = Field(None, description="Client's response to interventions and progress toward goals.")
+
+ class POMRNote(BaseModel):
+     """
+     POMR: Problem-Oriented Medical Record. Focuses on organizing data around problems.
+     """
+     Database: Optional[str] = Field(None, description="Patient's history, exam findings, and relevant tests.")
+     ProblemList: Optional[str] = Field(None, description="All identified problems, both active and resolved.")
119
+ InitialPlan: Optional[str] = Field(None, description="Initial plan to address each problem, including diagnostics or treatments.")
120
+ ProgressNotes: Optional[str] = Field(None, description="Ongoing progress, changes, and outcomes related to each problem.")
121
+
122
+ class NarrativeNote(BaseModel):
123
+ """
124
+ A Narrative note is a free-text record, providing flexibility for a descriptive, story-like documentation.
125
+ """
126
+ Narrative: Optional[str] = Field(None, description="A free-form description of the session, events, observations, and client interactions.")
127
+
128
+ class CBENote(BaseModel):
129
+ """
130
+ CBE: Charting By Exception. Only notes deviations from the norm.
131
+ """
132
+ Exceptions: Optional[str] = Field(None, description="Significant changes or unexpected findings from the norm, highlighting what differs.")
133
+
134
+ class SBARNote(BaseModel):
135
+ """
136
+ SBAR: Situation, Background, Assessment, and Recommendation. Used often in quick communication contexts.
137
+ """
138
+ Situation: Optional[str] = Field(None, description="Brief description of the patient's current situation or issue.")
139
+ Background: Optional[str] = Field(None, description="Relevant background information, history, current meds, or past sessions.")
140
+ Assessment: Optional[str] = Field(None, description="Clinician's assessment of the current condition or problem.")
141
+ Recommendation: Optional[str] = Field(None, description="Suggested next steps, treatments, referrals, or actions.")
142
+
143
+ class ExtractedNotes(BaseModel):
144
+ """Container for multiple note formats."""
145
+ soap: SOAPNote | None = None
146
+ dap: DAPNote | None = None
147
+ birp: BIRPNote | None = None
148
+ pirp: PIRPNote | None = None
149
+ girp: GIRPNote | None = None
150
+ sirp: SIRPNote | None = None
151
+ fairfdarp: FAIRFDARPNote | None = None
152
+ dare: DARENote | None = None
153
+ pie: PIENote | None = None
154
+ soapiier: SOAPIERNote | None = None
155
+ soapiie: SOAPIENote | None = None
156
+ pomr: POMRNote | None = None
157
+ narrative: NarrativeNote | None = None
158
+ cbe: CBENote | None = None
159
+ sbar: SBARNote | None = None
langhub/__init__.py ADDED
@@ -0,0 +1 @@
+ # Package initialization
langhub/prompts/__init__.py ADDED
@@ -0,0 +1 @@
+ # Package initialization
langhub/prompts/therapy_extraction_prompt.yaml ADDED
@@ -0,0 +1,237 @@
+ prompts:
+   soap:
+     system: |
+       You are an expert therapist assistant. Extract a SOAP note from the following therapy session transcript.
+     human: |
+       Please follow the SOAPNote schema strictly:
+       - Subjective: The client's own words about their feelings, symptoms, or concerns.
+       - Objective: The therapist's direct observations, measurable data, or test results.
+       - Assessment: Your clinical interpretation of the client's situation, progress, and any diagnostic impressions.
+       - Plan: The next steps to be taken, including future interventions, referrals, or adjustments in therapy.
+
+       If any of the fields are not mentioned, return them as null.
+
+       Transcript:
+       {text}
+
+   dap:
+     system: |
+       You are an expert therapist assistant. Extract a DAP note from the following therapy session transcript.
+     human: |
+       Based on the transcript, produce a DAP note:
+
+       - Data: Include both subjective (client's words/feelings) and objective (observed behaviors, test results) information in a factual manner.
+       - Assessment: Your interpretation, impression, or diagnosis based on the data.
+       - Plan: Outline next steps, goals, or interventions planned.
+
+       If any section isn't applicable, return it as null.
+
+       Transcript:
+       {text}
+
+   birp:
+     system: |
+       You are an expert therapist assistant. Extract a BIRP note from the following therapy session transcript.
+     human: |
+       Please extract a BIRP note following this structure:
+
+       - Behavior: Client's behavior during the session (verbal/non-verbal) and observations.
+       - Intervention: Specific techniques and methods used by the therapist.
+       - Response: How the client responded to interventions.
+       - Plan: Follow-up steps and future recommendations.
+
+       If any section isn't applicable, return it as null.
+
+       Transcript:
+       {text}
+
+   pirp:
+     system: |
+       You are an expert therapist assistant. Extract a PIRP note from the following therapy session transcript.
+     human: |
+       Please extract a PIRP note following this structure:
+
+       - Problem: The client's presenting problem, symptoms, or reason for seeking therapy.
+       - Intervention: Actions taken by the therapist to address the identified problem.
+       - Response: Client's reaction or changes after the intervention was applied.
+       - Plan: Next steps for addressing the problem, including future sessions, techniques, or referrals.
+
+       If any section isn't applicable, return it as null.
+
+       Transcript:
+       {text}
+
+   girp:
+     system: |
+       You are an expert therapist assistant. Extract a GIRP note from the following therapy session transcript.
+     human: |
+       Please extract a GIRP note following this structure:
+
+       - Goals: The client's short-term and long-term therapy goals or objectives.
+       - Intervention: Therapeutic interventions used to help the client work toward these goals.
+       - Response: How the client responded to the interventions and their progress toward goals.
+       - Plan: Plan for future sessions, homework, referrals, or adjustments to help achieve goals.
+
+       If any section isn't applicable, return it as null.
+
+       Transcript:
+       {text}
+
+   sirp:
+     system: |
+       You are an expert therapist assistant. Extract a SIRP note from the following therapy session transcript.
+     human: |
+       Please extract a SIRP note following this structure:
+
+       - Situation: The client's presenting situation, including current symptoms, concerns, and background info.
+       - Intervention: Interventions, assessments, and recommendations made during the session.
+       - Response: Client's response to the intervention, observed changes or feedback.
+       - Plan: Next steps, follow-up appointments, referrals, and any planned adjustments.
+
+       If any section isn't applicable, return it as null.
+
+       Transcript:
+       {text}
+
+   fair_fdarp:
+     system: |
+       You are an expert therapist assistant. Extract a FAIR/F-DARP note from the following therapy session transcript.
+     human: |
+       Please extract a FAIR/F-DARP note following this structure:
+
+       - Focus: Focus of the note, such as a nursing diagnosis, event, or primary concern.
+       - Data: Subjective and objective data about the client/patient condition.
+       - Action: Actions taken by the provider in response to the data.
+       - Response: Client's response to the actions taken.
+       - Plan: Future steps or follow-up if using the full F-DARP format.
+
+       If any section isn't applicable, return it as null.
+
+       Transcript:
+       {text}
+
+   dare:
+     system: |
+       You are an expert therapist assistant. Extract a DARE note from the following therapy session transcript.
+     human: |
+       Please extract a DARE note following this structure:
+
+       - Data: Subjective and objective client information and therapist's observations.
+       - Action: Specific actions, treatments, or interventions the therapist took.
+       - Response: Client's response to those actions, improvements, or changes in symptoms.
+       - Education: Education provided to the client about their condition, treatments, or coping strategies.
+
+       If any section isn't applicable, return it as null.
+
+       Transcript:
+       {text}
+
+   pie:
+     system: |
+       You are an expert therapist assistant. Extract a PIE note from the following therapy session transcript.
+     human: |
+       Please extract a PIE note following this structure:
+
+       - Problem: Client's identified problem, whether mental health symptom or behavior issue.
+       - Intervention: What the therapist did to address the problem (techniques, strategies).
+       - Evaluation: How effective the intervention was, changes in the client, and next steps.
+
+       If any section isn't applicable, return it as null.
+
+       Transcript:
+       {text}
+
+   soapier:
+     system: |
+       You are an expert therapist assistant. Extract a SOAPIER note from the following therapy session transcript.
+     human: |
+       Please extract a SOAPIER note following this structure:
+
+       - Subjective: Client's subjective complaints, feelings, statements.
+       - Objective: Observable, measurable data, test results, or observations.
+       - Assessment: Therapist's interpretation, diagnosis, or clinical judgment.
+       - Plan: Proposed interventions, follow-ups, or referrals.
+       - Intervention: Specific interventions implemented during the session.
+       - Evaluation: Client's response to interventions and progress made.
+       - Revision: Adjustments to the treatment plan based on evaluation.
+
+       If any section isn't applicable, return it as null.
+
+       Transcript:
+       {text}
+
+   soapie:
+     system: |
+       You are an expert therapist assistant. Extract a SOAPIE note from the following therapy session transcript.
+     human: |
+       Please extract a SOAPIE note following this structure:
+
+       - Subjective: Client's self-reported experiences and symptoms.
+       - Objective: Observable data and measurable findings.
+       - Assessment: Clinician's interpretation and clinical impressions.
+       - Plan: Planned interventions, referrals, or changes.
+       - Intervention: Interventions used during the session.
+       - Evaluation: Client's response to interventions and progress toward goals.
+
+       If any section isn't applicable, return it as null.
+
+       Transcript:
+       {text}
+
+   pomr:
+     system: |
+       You are an expert therapist assistant. Extract a POMR note from the following therapy session transcript.
+     human: |
+       Please extract a POMR note following this structure:
+
+       - Database: Patient's history, exam findings, and relevant tests.
+       - ProblemList: All identified problems, both active and resolved.
+       - InitialPlan: Initial plan to address each problem, including diagnostics or treatments.
+       - ProgressNotes: Ongoing progress, changes, and outcomes related to each problem.
+
+       If any section isn't applicable, return it as null.
+
+       Transcript:
+       {text}
+
+   narrative:
+     system: |
+       You are an expert therapist assistant. Extract a Narrative note from the following therapy session transcript.
+     human: |
+       Please extract a Narrative note following this structure:
+
+       - Narrative: A free-form description of the session, events, observations, and client interactions.
+
+       If the section isn't applicable, return it as null.
+
+       Transcript:
+       {text}
+
+   cbe:
+     system: |
+       You are an expert therapist assistant. Extract a CBE note from the following therapy session transcript.
+     human: |
+       Please extract a CBE note following this structure:
+
+       - Exceptions: Significant changes or unexpected findings from the norm, highlighting what differs.
+
+       If the section isn't applicable, return it as null.
+
+       Transcript:
+       {text}
+
+   sbar:
+     system: |
+       You are an expert therapist assistant. Extract an SBAR note from the following therapy session transcript.
+     human: |
+       Please extract an SBAR note following this structure:
+
+       - Situation: Brief description of the patient's current situation or issue.
+       - Background: Relevant background information, history, current meds, or past sessions.
+       - Assessment: Clinician's assessment of the current condition or problem.
+       - Recommendation: Suggested next steps, treatments, referrals, or actions.
+
+       If any section isn't applicable, return it as null.
+
+       Transcript:
+       {text}
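For reference, `yaml.safe_load` turns the file above into a plain nested dict keyed by note type, and `load_prompt` in `main.py` is then just a dictionary lookup. A minimal standalone sketch of that lookup (the prompt strings here are truncated, hypothetical stand-ins for the real ones):

```python
# Hypothetical excerpt of the structure yaml.safe_load returns for this file.
data = {
    "prompts": {
        "soap": {
            "system": "You are an expert therapist assistant. Extract a SOAP note...",
            "human": "Please follow the SOAPNote schema strictly:\n...\nTranscript:\n{text}",
        },
    }
}

def load_prompt(note_type: str) -> tuple[str, str]:
    """Look up the (system, human) prompt pair for a note type."""
    note_prompts = data.get("prompts", {}).get(note_type.lower())
    if not note_prompts:
        raise ValueError(f"No prompt template found for note type: {note_type}")
    return note_prompts["system"], note_prompts["human"]

system, human = load_prompt("SOAP")  # case-insensitive lookup
```

An unknown note type falls through to the `ValueError`, which is why the YAML keys must match the keys in `main.py`'s `schema_map`.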
main.py ADDED
@@ -0,0 +1,151 @@
+ from __future__ import annotations
+ import os
+ from pathlib import Path
+
+ import yaml
+ from dotenv import load_dotenv
+
+ # Load environment variables before importing modules that read them.
+ load_dotenv()
+
+ from langchain_core.prompts import ChatPromptTemplate
+ from langchain_core.runnables import RunnableSequence
+ from langchain.globals import set_llm_cache
+ from langchain_community.cache import SQLiteCache
+
+ from config.settings import settings
+ from forms.schemas import (
+     SOAPNote, DAPNote, BIRPNote, PIRPNote, GIRPNote, SIRPNote, FAIRFDARPNote,
+     DARENote, PIENote, SOAPIERNote, SOAPIENote, POMRNote, NarrativeNote,
+     CBENote, SBARNote,
+ )
+ from utils.youtube import download_transcript
+ from models.llm_provider import get_llm
+
+ # Cache LLM responses on disk to avoid repeated API calls for identical inputs.
+ set_llm_cache(SQLiteCache(database_path=".langchain.db"))
+
+ # Set environment for LangSmith tracing/logging
+ os.environ["LANGCHAIN_TRACING_V2"] = "true"
+ if settings.LANGCHAIN_API_KEY:
+     os.environ["LANGCHAIN_API_KEY"] = settings.LANGCHAIN_API_KEY
+
+ def load_prompt(note_type: str) -> tuple[str, str]:
+     """Load the prompt template from YAML for the specified note type."""
+     prompt_path = Path("langhub/prompts/therapy_extraction_prompt.yaml")
+     with open(prompt_path, "r") as f:
+         data = yaml.safe_load(f)
+
+     note_prompts = data.get("prompts", {}).get(note_type.lower())
+     if not note_prompts:
+         raise ValueError(f"No prompt template found for note type: {note_type}")
+
+     return note_prompts["system"], note_prompts["human"]
+
+ def create_extraction_chain(note_type: str = "soap") -> RunnableSequence:
+     """Create a chain for extracting structured notes."""
+     print(f"Creating extraction chain for {note_type.upper()} notes...")
+
+     print("Initializing LLM...")
+     llm = get_llm()
+
+     print("Setting up schema mapping...")
+     # Select the appropriate schema based on note type
+     schema_map = {
+         "soap": SOAPNote,
+         "dap": DAPNote,
+         "birp": BIRPNote,
+         "birp_raw": BIRPNote,
+         "pirp": PIRPNote,
+         "girp": GIRPNote,
+         "sirp": SIRPNote,
+         "fair_fdarp": FAIRFDARPNote,
+         "dare": DARENote,
+         "pie": PIENote,
+         "soapier": SOAPIERNote,
+         "soapie": SOAPIENote,
+         "pomr": POMRNote,
+         "narrative": NarrativeNote,
+         "cbe": CBENote,
+         "sbar": SBARNote,
+     }
+     schema = schema_map.get(note_type.lower())
+     if not schema:
+         raise ValueError(f"Unsupported note type: {note_type}")
+
+     print("Creating structured LLM output...")
+     structured_llm = llm.with_structured_output(schema=schema, include_raw=True)
+
+     print("Loading system prompt...")
+     # Load system prompt and human prompt for the specific note type
+     system_prompt, human_prompt = load_prompt(note_type)
+
+     print("Creating prompt template...")
+     prompt_template = ChatPromptTemplate.from_messages([
+         ("system", system_prompt),
+         ("human", human_prompt),
+     ])
+
+     print("Building extraction chain...")
+     chain = prompt_template | structured_llm
+
+     print("Extraction chain created successfully")
+     return chain
+
+ def process_session(url: str, note_type: str = "soap") -> dict:
+     """Process a single therapy session."""
+     try:
+         # Download transcript
+         print(f"Downloading transcript from {url}...")
+         transcript = download_transcript(url)
+
+         # Create extraction chain
+         chain = create_extraction_chain(note_type)
+
+         # Process transcript
+         print("Extracting structured notes...")
+         result = chain.invoke({
+             "note_type": note_type.upper(),
+             "text": transcript,
+         })
+
+         # With include_raw=True, the structured-output chain returns a dict
+         # containing "raw", "parsed", and "parsing_error" keys.
+         parsed = result.get("parsed")
+         return parsed.model_dump() if parsed else {}
+
+     except Exception as e:
+         print(f"Error processing session: {e}")
+         return {}
+
+ def main():
+     # Example YouTube sessions
+     sessions = [
+         {
+             "title": "CBT Role-Play – Complete Session – Part 6",
+             "url": "https://www.youtube.com/watch?v=KuHLL2AE-SE"
+         },
+         {
+             "title": "CBT Role-Play – Complete Session – Part 7",
+             "url": "https://www.youtube.com/watch?v=jS1KE3_Pqlc"
+         }
+     ]
+
+     for session in sessions:
+         print(f"\nProcessing session: {session['title']}")
+
+         # Extract notes in different formats
+         note_types = ["soap", "dap", "birp"]
+         results = {}
+
+         for note_type in note_types:
+             print(f"\nExtracting {note_type.upper()} notes...")
+             results[note_type] = process_session(session["url"], note_type)
+
+         # Print results
+         print(f"\nResults for '{session['title']}':")
+         for note_type, notes in results.items():
+             print(f"\n{note_type.upper()} Notes:")
+             print(yaml.dump(notes, default_flow_style=False))
+
+ if __name__ == "__main__":
+     main()
models/__init__.py ADDED
@@ -0,0 +1 @@
+ # Package initialization
models/llm_provider.py ADDED
@@ -0,0 +1,43 @@
+ from __future__ import annotations
+ from config.settings import settings
+ from langchain_openai import ChatOpenAI
+ from langchain_google_genai import ChatGoogleGenerativeAI
+
+ def get_model_identifier(llm) -> str:
+     """Get a unique identifier for the model."""
+     if isinstance(llm, ChatOpenAI):
+         return f"openai-{llm.model_name}"
+     elif isinstance(llm, ChatGoogleGenerativeAI):
+         return f"gemini-{settings.GEMINI_MODEL_NAME}"
+     else:
+         return "unknown-model"
+
+ def get_llm(model_name: str | None = None):
+     """
+     Return an LLM instance based on the configured provider.
+     """
+     provider = settings.MODEL_PROVIDER
+
+     if provider == "openai":
+         model_name = model_name or settings.OPENAI_MODEL_NAME
+         if not settings.OPENAI_API_KEY:
+             raise ValueError("OPENAI_API_KEY is not set")
+         llm = ChatOpenAI(
+             model=model_name,
+             openai_api_key=settings.OPENAI_API_KEY,
+             temperature=0,
+         )
+     elif provider == "google_gemini":
+         model_name = model_name or settings.GEMINI_MODEL_NAME
+         if not settings.GOOGLE_API_KEY:
+             raise ValueError("GOOGLE_API_KEY is not set")
+         llm = ChatGoogleGenerativeAI(
+             model=model_name,
+             temperature=0,
+             max_tokens=None,
+             max_retries=2,
+         )
+     else:
+         raise ValueError(f"Unknown model provider: {provider}")
+
+     return llm
requirements.txt ADDED
@@ -0,0 +1,16 @@
+ langchain
+ langchain-core
+ langchain-openai>=0.1.0
+ langgraph>=0.1.45
+ pydantic>=2.0.0
+ openai
+ youtube-transcript-api
+ pyyaml
+ langchain-google-genai
+ python-dotenv
+ langchain-community
+ gradio>=4.0.0
+ google-generativeai>=0.3.0
+ pydub
+ python-slugify
+ deepgram-sdk>=3.0,<4.0
utils/__init__.py ADDED
@@ -0,0 +1 @@
+ # Package initialization
utils/audio.py ADDED
@@ -0,0 +1,38 @@
+ from __future__ import annotations
+ from pathlib import Path
+ from pydub import AudioSegment
+ from utils.transcription import TranscriptionService
+
+ # Initialize the transcription service
+ transcription_service = TranscriptionService()
+
+ def convert_audio_to_wav(audio_path: str | Path) -> str:
+     """Convert uploaded audio to WAV format if needed."""
+     audio_path = Path(audio_path)
+     output_path = audio_path.with_suffix('.wav')
+
+     if audio_path.suffix.lower() != '.wav':
+         print(f"Converting {audio_path.name} to WAV format...")
+         audio = AudioSegment.from_file(audio_path)
+         audio.export(output_path, format='wav')
+         return str(output_path)
+
+     return str(audio_path)
+
+ def transcribe_audio(audio_path: str | Path) -> str:
+     """
+     Transcribe audio using Deepgram.
+     Supports multiple audio formats; converts to WAV if needed.
+     """
+     try:
+         # Convert to WAV if needed
+         wav_path = convert_audio_to_wav(audio_path)
+
+         # Transcribe using Deepgram
+         return transcription_service.transcribe_file(wav_path)
+
+     except Exception as e:
+         raise RuntimeError(f"Error transcribing audio: {e}") from e
utils/cache.py ADDED
@@ -0,0 +1,142 @@
+ from __future__ import annotations
+ import hashlib
+ import json
+ import sqlite3
+ from pathlib import Path
+ from datetime import datetime
+
+ class CacheManager:
+     def __init__(self, cache_dir: str | Path = "cache"):
+         self.cache_dir = Path(cache_dir)
+         self.cache_dir.mkdir(parents=True, exist_ok=True)
+
+         # Create SQLite database for structured results
+         self.db_path = self.cache_dir / "extraction_cache.db"
+         self._init_db()
+
+     def _init_db(self):
+         """Initialize the SQLite database with the necessary tables."""
+         with sqlite3.connect(self.db_path) as conn:
+             conn.execute("""
+                 CREATE TABLE IF NOT EXISTS extractions (
+                     input_hash TEXT,
+                     form_type TEXT,
+                     result TEXT,
+                     model_name TEXT,
+                     timestamp DATETIME,
+                     -- model_name is part of the key so results from different
+                     -- models for the same input do not overwrite each other
+                     PRIMARY KEY (input_hash, form_type, model_name)
+                 )
+             """)
+
+             conn.execute("""
+                 CREATE TABLE IF NOT EXISTS transcripts (
+                     video_id TEXT PRIMARY KEY,
+                     transcript TEXT,
+                     timestamp DATETIME
+                 )
+             """)
+
+     def _hash_content(self, content: str) -> str:
+         """Generate a stable hash for input content."""
+         return hashlib.sha256(content.encode('utf-8')).hexdigest()
+
+     def get_transcript(self, video_id: str) -> str | None:
+         """Retrieve a cached transcript if it exists."""
+         with sqlite3.connect(self.db_path) as conn:
+             cursor = conn.execute(
+                 "SELECT transcript FROM transcripts WHERE video_id = ?",
+                 (video_id,)
+             )
+             result = cursor.fetchone()
+             return result[0] if result else None
+
+     def store_transcript(self, video_id: str, transcript: str):
+         """Store a transcript in the cache."""
+         with sqlite3.connect(self.db_path) as conn:
+             conn.execute(
+                 """
+                 INSERT OR REPLACE INTO transcripts (video_id, transcript, timestamp)
+                 VALUES (?, ?, ?)
+                 """,
+                 (video_id, transcript, datetime.now())
+             )
+
+     def get_extraction(
+         self,
+         input_content: str,
+         form_type: str,
+         model_name: str
+     ) -> dict | None:
+         """Retrieve cached extraction results if they exist."""
+         input_hash = self._hash_content(input_content)
+
+         with sqlite3.connect(self.db_path) as conn:
+             cursor = conn.execute(
+                 """
+                 SELECT result FROM extractions
+                 WHERE input_hash = ? AND form_type = ? AND model_name = ?
+                 """,
+                 (input_hash, form_type, model_name)
+             )
+             result = cursor.fetchone()
+
+         if result:
+             return json.loads(result[0])
+         return None
+
+     def store_extraction(
+         self,
+         input_content: str,
+         form_type: str,
+         result: dict,
+         model_name: str
+     ):
+         """Store extraction results in the cache."""
+         input_hash = self._hash_content(input_content)
+
+         with sqlite3.connect(self.db_path) as conn:
+             conn.execute(
+                 """
+                 INSERT OR REPLACE INTO extractions
+                 (input_hash, form_type, result, model_name, timestamp)
+                 VALUES (?, ?, ?, ?, ?)
+                 """,
+                 (
+                     input_hash,
+                     form_type,
+                     json.dumps(result),
+                     model_name,
+                     datetime.now()
+                 )
+             )
+
+     def clear_cache(self, older_than_days: int | None = None):
+         """Clear the cache, optionally only entries older than the specified number of days."""
+         with sqlite3.connect(self.db_path) as conn:
+             if older_than_days is not None:
+                 conn.execute(
+                     """
+                     DELETE FROM extractions
+                     WHERE timestamp < datetime('now', ?)
+                     """,
+                     (f'-{older_than_days} days',)
+                 )
+                 conn.execute(
+                     """
+                     DELETE FROM transcripts
+                     WHERE timestamp < datetime('now', ?)
+                     """,
+                     (f'-{older_than_days} days',)
+                 )
+             else:
+                 conn.execute("DELETE FROM extractions")
+                 conn.execute("DELETE FROM transcripts")
+
+     def cleanup_gradio_cache(self):
+         """Clean up Gradio's example cache directory."""
+         gradio_cache = Path(".gradio")
+         if gradio_cache.exists():
+             import shutil
+             shutil.rmtree(gradio_cache)
+             print("Cleaned up Gradio cache")
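Both `store_transcript` and `store_extraction` rely on SQLite's `INSERT OR REPLACE` upsert: re-storing a row with the same primary key replaces it rather than creating a duplicate. A standalone sketch of that pattern against an in-memory database:

```python
import sqlite3

# In-memory database so the sketch runs standalone; CacheManager above uses
# the same statement shape against cache/extraction_cache.db.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transcripts (video_id TEXT PRIMARY KEY, transcript TEXT)")

def store(video_id: str, transcript: str) -> None:
    # INSERT OR REPLACE deletes any row with the same primary key first.
    conn.execute(
        "INSERT OR REPLACE INTO transcripts (video_id, transcript) VALUES (?, ?)",
        (video_id, transcript),
    )

store("abc123", "first version")
store("abc123", "second version")  # replaces the earlier row

row = conn.execute(
    "SELECT transcript FROM transcripts WHERE video_id = ?", ("abc123",)
).fetchone()
print(row[0])  # → second version
```

This is also why the `extractions` table needs every lookup column in its primary key: columns outside the key are silently overwritten on replace.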
utils/text_processing.py ADDED
@@ -0,0 +1,12 @@
+ from __future__ import annotations
+
+ def chunk_text(text: str, chunk_size: int = 3000) -> list[str]:
+     """
+     Split text into chunks of at most `chunk_size` words, for transcripts
+     too long to fit in a single model context window.
+     """
+     words = text.split()
+     chunks = []
+     for i in range(0, len(words), chunk_size):
+         chunks.append(" ".join(words[i:i + chunk_size]))
+     return chunks
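The chunking helper is easy to sanity-check; this standalone sketch reproduces the same logic (note that `chunk_size` counts words, not characters):

```python
def chunk_text(text: str, chunk_size: int = 3000) -> list[str]:
    # Word-based chunking: split on whitespace, regroup chunk_size words at a time.
    words = text.split()
    return [" ".join(words[i:i + chunk_size]) for i in range(0, len(words), chunk_size)]

print(chunk_text("one two three four five", chunk_size=2))
# → ['one two', 'three four', 'five']
```

Because the split is on whitespace, original spacing and line breaks are not preserved, which is acceptable for feeding transcripts to an LLM.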
utils/transcription.py ADDED
@@ -0,0 +1,68 @@
+ from __future__ import annotations
+ from pathlib import Path
+ import httpx
+ from deepgram import (
+     DeepgramClient,
+     DeepgramClientOptions,
+     PrerecordedOptions,
+     FileSource,
+ )
+ from config.settings import settings
+
+ class TranscriptionService:
+     def __init__(self):
+         if not settings.DEEPGRAM_API_KEY:
+             raise ValueError("DEEPGRAM_API_KEY is not set in environment variables")
+
+         # Initialize Deepgram client with options
+         config = DeepgramClientOptions(
+             verbose=False,  # Set to True for debugging
+         )
+         self.client = DeepgramClient(settings.DEEPGRAM_API_KEY, config)
+
+     def transcribe_file(self, audio_path: str | Path) -> str:
+         """
+         Transcribe an audio file using Deepgram.
+
+         Args:
+             audio_path: Path to the audio file
+
+         Returns:
+             Transcribed text with proper formatting
+         """
+         try:
+             print(f"Transcribing audio file: {audio_path}")
+
+             # Read file into buffer
+             with open(audio_path, "rb") as file:
+                 buffer_data = file.read()
+
+             # Create payload
+             payload: FileSource = {
+                 "buffer": buffer_data,
+             }
+
+             # Configure transcription options
+             options = PrerecordedOptions(
+                 model="nova-2",
+                 smart_format=True,
+                 language="en-US",
+                 utterances=True,
+                 punctuate=True,
+                 diarize=True,
+             )
+
+             # Transcribe with a generous timeout for long recordings
+             response = self.client.listen.rest.v("1").transcribe_file(
+                 payload,
+                 options,
+                 timeout=httpx.Timeout(300.0, connect=10.0),
+             )
+
+             # Extract the transcript from the response
+             transcript = response.results.channels[0].alternatives[0].transcript
+             return transcript.strip()
+
+         except Exception as e:
+             raise RuntimeError(f"Error transcribing with Deepgram: {e}") from e
utils/youtube.py ADDED
@@ -0,0 +1,20 @@
+ from __future__ import annotations
+ from urllib.parse import urlparse, parse_qs
+ from youtube_transcript_api import YouTubeTranscriptApi
+
+ def extract_youtube_video_id(url: str) -> str:
+     """Extract the video_id from a YouTube URL."""
+     parsed = urlparse(url)
+     if parsed.hostname in ('www.youtube.com', 'youtube.com', 'm.youtube.com'):
+         query = parse_qs(parsed.query)
+         if 'v' in query:
+             return query['v'][0]
+     elif parsed.hostname == 'youtu.be':
+         return parsed.path.lstrip('/')
+     raise ValueError(f"Invalid YouTube URL: {url}")
+
+ def download_transcript(url: str) -> str:
+     """Download the YouTube transcript as a single string."""
+     video_id = extract_youtube_video_id(url)
+     print(f"Downloading transcript for video ID: {video_id}")
+     transcript_list = YouTubeTranscriptApi.get_transcript(video_id)
+     return " ".join(item['text'] for item in transcript_list)
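Both URL shapes handled by `extract_youtube_video_id` can be exercised standalone; the parsing logic is reproduced here so the snippet runs on its own, using the sample video ID from `main.py`:

```python
from urllib.parse import urlparse, parse_qs

def extract_youtube_video_id(url: str) -> str:
    # Watch-page URLs carry the ID in the ?v= query parameter;
    # youtu.be short links carry it in the path.
    parsed = urlparse(url)
    if parsed.hostname in ("www.youtube.com", "youtube.com"):
        return parse_qs(parsed.query)["v"][0]
    if parsed.hostname == "youtu.be":
        return parsed.path.lstrip("/")
    raise ValueError("Invalid YouTube URL")

print(extract_youtube_video_id("https://www.youtube.com/watch?v=KuHLL2AE-SE"))  # → KuHLL2AE-SE
print(extract_youtube_video_id("https://youtu.be/KuHLL2AE-SE"))                 # → KuHLL2AE-SE
```

Other hostnames (e.g. embeds or mobile links) would need their own branch; the function raises `ValueError` for anything it does not recognize.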