Spaces · Running
Benjamin Consolvo committed · Commit 7ed8641 · Parent(s): efb324b
first app commit
Browse files
- Dockerfile +0 -21
- OAI_CONFIG_LIST.json +14 -0
- README.md +146 -13
- app.py +551 -0
- intelpreventativehealthcare.py +649 -0
- pyproject.toml +20 -0
- requirements.txt +10 -1
- src/streamlit_app.py +0 -40
Dockerfile
DELETED
@@ -1,21 +0,0 @@
```dockerfile
FROM python:3.9-slim

WORKDIR /app

RUN apt-get update && apt-get install -y \
    build-essential \
    curl \
    software-properties-common \
    git \
    && rm -rf /var/lib/apt/lists/*

COPY requirements.txt ./
COPY src/ ./src/

RUN pip3 install -r requirements.txt

EXPOSE 8501

HEALTHCHECK CMD curl --fail http://localhost:8501/_stcore/health

ENTRYPOINT ["streamlit", "run", "src/streamlit_app.py", "--server.port=8501", "--server.address=0.0.0.0"]
```
OAI_CONFIG_LIST.json
ADDED
@@ -0,0 +1,14 @@
```json
[
    {
        "model": "meta-llama/Llama-3.3-70B-Instruct",
        "base_url": "https://api.inference.denvrdata.com/v1/",
        "api_key": "",
        "price": [0.0, 0.0]
    },
    {
        "model": "deepseek-ai/DeepSeek-R1-Distill-Llama-70B",
        "base_url": "https://api.inference.denvrdata.com/v1/",
        "api_key": "",
        "price": [0.0, 0.0]
    }
]
```
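For context, AutoGen loads and filters this file with `config_list_from_json`; `get_configs` in `intelpreventativehealthcare.py` (added later in this commit) is a thin wrapper around exactly this call. A minimal sketch:

```python
# Load the configs added above and keep only the Llama entry.
from autogen import config_list_from_json

config_list_llama = config_list_from_json(
    env_or_file="OAI_CONFIG_LIST.json",
    filter_dict={"model": ["meta-llama/Llama-3.3-70B-Instruct"]},
)
```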
README.md
CHANGED
@@ -1,20 +1,153 @@
Before:

```markdown
---
title: Preventative Healthcare
emoji:
colorFrom:
colorTo:
sdk:
- streamlit
pinned: false
short_description: Streamlit template space
license: apache-2.0
---

If you have any questions, checkout our [documentation](https://docs.streamlit.io) and [community forums](https://discuss.streamlit.io).
```
After:

---
title: Preventative Healthcare with AutoGen
emoji: 🔥
colorFrom: yellow
colorTo: purple
sdk: streamlit
sdk_version: 1.42.2
app_file: app.py
pinned: false
license: apache-2.0
short_description: Using AI agents for preventative healthcare maintenance
---
[//]: <Add samples here https://github.com/microsoft/autogen/tree/main/python/samples>

## AutoGen Multi-Agent Chat Preventative Healthcare

This is a multi-agent system built on top of AutoGen agents, designed to automate and optimize preventative healthcare outreach. It uses multiple agents, large language models (LLMs), and asynchronous programming to streamline identifying patients who meet specific screening criteria, filtering patient data, and generating personalized outreach emails.

The system uses model endpoints hosted by [Denvr Dataworks](https://www.denvrdata.com/intel) on Intel® Gaudi® accelerators, and an OpenAI-compatible API key.

Credit: Though heavily modified, the original idea comes from Mike Lynch on his [Medium blog](https://medium.com/@micklynch_6905/hospitalgpt-managing-a-patient-population-with-autogen-powered-by-gpt-4-mixtral-8x7b-ef9f54f275f1).

### Workflow

<p align="center">
  <img width="700" src="images/prev_healthcare_4.drawio.svg">
</p>

1. **Define Screening Criteria**: After getting the general screening task from the user, the User Proxy Agent starts a conversation between the Epidemiologist Agent and the Doctor Critic Agent to define the criteria for patient outreach based on the target screening type. The output criteria are an age range (e.g., 40–70), gender, and relevant medical history.

2. **Filter Patients**: The Data Analyst Agent filters patient data from a CSV file based on the defined criteria, including age range, gender, and medical conditions. The patient data are synthetically generated. You can find the sample data under [data/patients.csv](data/patients.csv).

3. **Generate Outreach Emails**: The program generates outreach emails for the filtered patients using LLMs and saves them as text files. (A sketch of the criteria handed between steps follows this list.)
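For illustration, the structured arguments the agents converge on before filtering look like the following. The field names match `get_patients_from_criteria` in `intelpreventativehealthcare.py` (added in this commit); the values here are hypothetical:

```python
# Hypothetical example of the criteria the agents hand to the patient filter.
arguments_criteria = {
    "patients_file": "data/patients.csv",
    "min_age": 40,
    "max_age": 70,
    "criteria": "adenomatous polyps",  # prior condition to match
    "gender": "None",                  # "M", "F", or "None" for no gender filter
}
```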
### Setup

If you want a local copy of the application to run, clone the repository and navigate into the folder:

```bash
git clone https://huggingface.co/spaces/Intel/preventative_healthcare
cd preventative_healthcare
```

You can use the `uv` package manager to manage your virtual environment and dependencies. Initialize the `uv` project and create the virtual environment:

```bash
uv init
uv venv
```

Activate the virtual environment:
```bash
source .venv/bin/activate
```

Install dependencies:
```bash
uv sync
```

To deactivate the virtual environment when you are finished running the application:
```bash
deactivate
```

### OpenAI API Key, Model Name, and Endpoint URL

1. Add your OpenAI-compatible API key to the `OAI_CONFIG_LIST.json` file.
2. Modify `model` and `base_url` to the model name and endpoint URL that you are using. (The `price` field gives the per-1k-token prompt and completion prices AutoGen uses for cost tracking; zeros silence cost estimation.) The `OAI_CONFIG_LIST.json` should look like:
```json
[
    {
        "model": "meta-llama/Llama-3.3-70B-Instruct",
        "base_url": "https://api.inference.denvrdata.com/v1/",
        "api_key": "",
        "price": [0.0, 0.0]
    },
    {
        "model": "deepseek-ai/DeepSeek-R1-Distill-Llama-70B",
        "base_url": "https://api.inference.denvrdata.com/v1/",
        "api_key": "",
        "price": [0.0, 0.0]
    }
]
```

### Modifying prompts

To modify prompts, you can edit them in the UI, or edit the following files (a sketch of what these modules look like follows this list):

1. User proxy agent: the agent responsible for passing along the user's preventative healthcare task to the other agents.
   [prompts/user_proxy_prompt.py](prompts/user_proxy_prompt.py)
2. Epidemiologist agent: the disease specialist who gathers the preventative healthcare task and decides on patient criteria.
   [prompts/epidemiologist_prompt.py](prompts/epidemiologist_prompt.py)
3. Doctor Critic agent: reviews the criteria from the epidemiologist and passes them along. The output is used to filter actual patients from the patient data.
   [prompts/doctor_critic_prompt.py](prompts/doctor_critic_prompt.py)
4. Outreach email: not an agent, but still uses an LLM to build the outreach email.
   [prompts/outreach_email_prompt.py](prompts/outreach_email_prompt.py)
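Each of these files is a small Python module exposing a prompt string; the constant names below match the imports in `intelpreventativehealthcare.py`, but the prompt wording is an illustrative placeholder rather than the shipped text:

```python
# prompts/epidemiologist_prompt.py -- illustrative sketch; see the repo for the real prompt.
EPIDEMIOLOGIST_PROMPT = """You are an epidemiologist. Given a preventative screening task,
propose patient outreach criteria: minimum age, maximum age, gender, and any relevant
prior condition. Reply TERMINATE when the criteria are final."""
```

The trailing `TERMINATE` convention matters: the agents' `is_termination_msg` checks (visible in `intelpreventativehealthcare.py` below) end the group chat when a reply ends with it.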
### Example Usage

To run the app with Streamlit:

```bash
streamlit run app.py
```

If you just want to run the script locally from the command line, use the following command:

```bash
python intelpreventativehealthcare.py \
    --oai_config "OAI_CONFIG_LIST.json" \
    --target_screening "Type 2 Diabetes" \
    --patients_file "data/patients.csv" \
    --phone "123-456-7890" \
    --email "[email protected]" \
    --name "Benjamin Consolvo"
```

The arguments are defined as follows (a sketch of the corresponding parser follows this list):

- `--oai_config`: Path to the `OAI_CONFIG_LIST.json` file, which contains the model endpoints, model names, and API key.
- `--target_screening`: The type of screening task (e.g., "Type 2 Diabetes screening").
- `--patients_file`: Path to the CSV file containing patient data. Default is `data/patients.csv`.
- `--phone`: Phone number to include in the outreach emails. Default is `123-456-7890`.
- `--email`: Reply email address to include in the outreach emails. Default is `[email protected]`.
- `--name`: Name to include in the outreach emails. Default is `Benjamin Consolvo`.
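These flags presumably map onto a standard `argparse` parser in the script's entry point (the script imports `argparse`, though its `__main__` section falls outside the portion of the diff shown below). A sketch that mirrors the documented flags and defaults:

```python
# Hypothetical parser mirroring the flags above; the real one lives in
# intelpreventativehealthcare.py's entry point, which this diff view truncates.
import argparse

parser = argparse.ArgumentParser(description="Preventative healthcare outreach")
parser.add_argument("--oai_config", default="OAI_CONFIG_LIST.json")
parser.add_argument("--target_screening", required=True)
parser.add_argument("--patients_file", default="data/patients.csv")
parser.add_argument("--phone", default="123-456-7890")
parser.add_argument("--email", default="[email protected]")
parser.add_argument("--name", default="Benjamin Consolvo")
args = parser.parse_args()
```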
This will process the patient data, filter it based on the specified criteria, and generate outreach emails for the patients. The emails are saved as text files in the `data/` directory.

### 6 Lessons Learned

1. Some LLMs perform better than others at certain tasks. While this may seem obvious, in practice you often need to adjust which LLMs you use after seeing the results. In my case, I found that the [meta-llama/Llama-3.3-70B-Instruct](https://huggingface.co/meta-llama/Llama-3.3-70B-Instruct) model was much more consistent and hallucinated less than [mistralai/Mixtral-8x7B-v0.1](https://huggingface.co/mistralai/Mixtral-8x7B-v0.1) for email generation.
2. Setting temperature to 0 is important for getting consistent output from LLMs. In my use case, I set this creativity level to 0 across all models (see the sketch after this list).
3. Prompt engineering is critical when instructing LLMs on what to do. My top 3 tips:
   - Be specific and detailed
   - Give exact output format examples
   - Tell the LLM what to do, rather than telling it everything it should not do

   You can read more about prompt engineering on [OpenAI's blog](https://help.openai.com/en/articles/6654000-best-practices-for-prompt-engineering-with-the-openai-api).

4. Certain tasks are easier to manage with traditional programming than by building an agent to do them. To get data from a database consistently in a specified format, write a function rather than building an agent; the LLM may hallucinate and not carry out the task correctly. After fighting with the agents, I implemented this in a function called `get_patients_from_criteria`. When I started this project, the LLMs were inventing data that were not part of the database, even though I clearly instructed the agent to only use data from the database! To resolve this, I made sure the agent reads from the database through a specific function via a tool call.
5. Do operations asynchronously wherever possible. Instead of writing emails one by one in a for loop, write them all at once with `async` (see the sketch after this list).
6. Code-writing tools like GitHub Copilot, Cursor, and Windsurf can save a lot of time, but you still need to pay attention to the output and understand what is going on in the code. Relying purely on code generation accumulates unnecessary lines of code and technical debt.
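A minimal sketch of lessons 2 and 5 together, assuming an OpenAI-compatible client like the one this repo uses (`generate_email` in `intelpreventativehealthcare.py` below does the same thing with `functools.partial` and `run_in_executor`):

```python
import asyncio
import functools

async def draft_email(client, model, prompt):
    # Lesson 2: temperature=0 (plus a fixed seed) keeps drafts deterministic.
    call = functools.partial(
        client.chat.completions.create,
        model=model,
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
        seed=42,
    )
    # Run the synchronous client call in a thread so the event loop stays free.
    completion = await asyncio.get_event_loop().run_in_executor(None, call)
    return completion.choices[0].message.content

async def draft_all(client, model, prompts):
    # Lesson 5: fan out one request per patient and await them together.
    return await asyncio.gather(*(draft_email(client, model, p) for p in prompts))
```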
### Follow Up

Get your own OpenAI-compatible API key and connect your agents to LLMs on Intel® Gaudi® accelerators with just an endpoint, courtesy of cloud provider Denvr Dataworks: https://www.denvrdata.com/intel

Chat with 6K+ fellow developers on the Intel DevHub Discord: https://discord.gg/kfJ3NKEw5t

Connect with me on LinkedIn: https://linkedin.com/in/bconsolvo
app.py
ADDED
@@ -0,0 +1,551 @@
```python
import streamlit as st
import pandas as pd
import asyncio
import io
import contextlib
import os
from pathlib import Path
from intelpreventativehealthcare import (
    target_patients_outreach,
    find_patients,
    write_outreach_emails,
    get_configs,
)
# Import the prompt templates
from intelpreventativehealthcare import (
    USER_PROXY_PROMPT,
    EPIDEMIOLOGIST_PROMPT,
    DOCTOR_CRITIC_PROMPT,
    OUTREACH_EMAIL_PROMPT_TEMPLATE,
)
from openai import OpenAI
import streamlit.components.v1 as components  # Add this import for custom HTML

# Streamlit app configuration
st.set_page_config(page_title="Preventative Healthcare Outreach", layout="wide")

# Title at the top of the app
st.title("Preventative Healthcare Outreach")
st.markdown("""
Visit the README page below to learn how the agentic system works. The system uses AI agents to generate outreach criteria, filter patients, and ultimately write outreach emails. To get the agents working, you can follow these steps:
1. Optionally, customize the prompts of the agents, or just use the default ones to get started.
2. Select default patient data, or upload your own CSV file.
3. Describe a medical screening task.
4. Click on "Generate Outreach Emails" to create draft emails to patients (.txt files with email drafts).
""")

# Function to read README.md file
def read_readme():
    readme_path = Path(__file__).parent / "README.md"

    if readme_path.exists():
        with open(readme_path, 'r') as f:
            readme_content = f.read()
        return readme_content
    else:
        return "README.md file not found in the project directory."

# Function to embed SVG images directly into the markdown content
def fix_svg_images_in_markdown(markdown_content):
    import re

    # Find SVG image tags in the markdown content
    svg_pattern = r'<img[^>]*src="([^"]*\.svg)"[^>]*>'

    def replace_with_embedded_svg(match):
        img_tag = match.group(0)
        src_match = re.search(r'src="([^"]*)"', img_tag)
        if not src_match:
            return img_tag

        src_path = src_match.group(1)
        width_match = re.search(r'width="([^"]*)"', img_tag)
        width = width_match.group(1) if width_match else "100%"

        # Construct full path to the image
        img_path = Path(__file__).parent / src_path

        if img_path.exists():
            try:
                # Read SVG content directly
                with open(img_path, 'r') as f:
                    svg_content = f.read()

                # Create a custom HTML component for the SVG with proper styling
                return f"""<div style="text-align:center; margin:20px 0;">
                    <div style="max-width:{width}px; margin:0 auto;">
                        {svg_content}
                    </div>
                </div>"""
            except Exception as e:
                return f"""<div style="text-align:center; color:red; padding:10px;">
                    Error loading SVG image: {e}
                </div>"""
        else:
            return f"""<div style="text-align:center; color:red; padding:10px;">
                Image not found: {src_path}
            </div>"""

    # Replace all SVG image tags with embedded SVG content
    return re.sub(svg_pattern, replace_with_embedded_svg, markdown_content)

# Create tabs
tab1, tab2 = st.tabs(["Healthcare Outreach App", "README"])

# Initialize session state for prompts if not already present
if 'user_proxy_prompt' not in st.session_state:
    st.session_state.user_proxy_prompt = USER_PROXY_PROMPT
if 'epidemiologist_prompt' not in st.session_state:
    st.session_state.epidemiologist_prompt = EPIDEMIOLOGIST_PROMPT
if 'doctor_critic_prompt' not in st.session_state:
    st.session_state.doctor_critic_prompt = DOCTOR_CRITIC_PROMPT
if 'outreach_email_prompt' not in st.session_state:
    st.session_state.outreach_email_prompt = OUTREACH_EMAIL_PROMPT_TEMPLATE

# Main Healthcare App Tab (Tab 1)
with tab1:
    # --- Activity/log screen for agent communication ---
    st.markdown("### Activity Log")
    # Create a container with fixed height and scrollbar for logs
    log_container = st.container()
    with log_container:
        # Use an expander that's open by default to contain the log
        with st.expander("Real-time Log", expanded=True):
            log_placeholder = st.empty()

# --- Move user inputs, instructions, and CSV column info to sidebar ---
with st.sidebar:
    # Add a section for customizing prompts at the top of the sidebar
    st.markdown("### Customize Agent Prompts")
    st.caption("The agents use LLMs and natural language understanding (NLU) to organize the tasks they need to accomplish. You can modify the prompts for each agent below; these prompts are given to the agents so that they can work together to produce the final outreach emails for the preventative healthcare task at hand.")

    # User Proxy Prompt
    with st.expander("User Proxy Prompt"):
        user_prompt = st.text_area(
            "User Proxy Prompt",
            value=st.session_state.user_proxy_prompt,
            height=300,
            key="user_proxy_input",
            label_visibility="hidden",
            # Add these style properties to preserve whitespace formatting
            help="",
            placeholder="",
            disabled=False,
            # Use CSS to preserve whitespace formatting
            max_chars=None
        )
        st.session_state.user_proxy_prompt = user_prompt

    # Epidemiologist Prompt
    with st.expander("Epidemiologist Prompt"):
        epi_prompt = st.text_area(
            "Epidemiologist Prompt",
            value=st.session_state.epidemiologist_prompt,
            height=300,
            key="epidemiologist_input",
            label_visibility="hidden",
            help="",
            placeholder="",
            disabled=False,
            max_chars=None
        )
        st.session_state.epidemiologist_prompt = epi_prompt

    # Doctor Critic Prompt
    with st.expander("Doctor Critic Prompt"):
        doc_prompt = st.text_area(
            "Doctor Critic Prompt",
            value=st.session_state.doctor_critic_prompt,
            height=300,
            key="doctor_critic_input",
            label_visibility="hidden",
            help="",
            placeholder="",
            disabled=False,
            max_chars=None
        )
        st.session_state.doctor_critic_prompt = doc_prompt

    # Outreach Email Prompt Template
    with st.expander("Email Template Prompt"):
        email_prompt = st.text_area(
            "Email Template Prompt",
            value=st.session_state.outreach_email_prompt,
            height=300,
            key="email_template_input",
            label_visibility="hidden",
            help="",
            placeholder="",
            disabled=False,
            max_chars=None
        )
        st.session_state.outreach_email_prompt = email_prompt

    # Add custom CSS to preserve whitespace in text areas while ensuring content fits
    st.markdown("""
    <style>
    .stTextArea textarea {
        font-family: monospace;
        white-space: pre-wrap !important; /* Use pre-wrap to preserve whitespace but allow wrapping */
        word-wrap: break-word !important; /* Ensure words break to next line if needed */
        line-height: 1.4;
        tab-size: 2; /* Reduce tab size to save space */
        padding: 8px;
        font-size: 0.9em; /* Slightly smaller font to fit more content */
    }
    </style>
    """, unsafe_allow_html=True)

    # Reset prompts button
    if st.button("Reset Prompts to Default"):
        st.session_state.user_proxy_prompt = USER_PROXY_PROMPT
        st.session_state.epidemiologist_prompt = EPIDEMIOLOGIST_PROMPT
        st.session_state.doctor_critic_prompt = DOCTOR_CRITIC_PROMPT
        st.session_state.outreach_email_prompt = OUTREACH_EMAIL_PROMPT_TEMPLATE
        st.rerun()

    st.markdown("---")

    # Now add the "Get started" section after the prompts
    st.header("Patient Data and Screening Task")

    st.caption("Required CSV columns: patient_id, First Name, Last Name, Email, Patient diagnosis summary, age, gender, condition")

    # Create a container for the default dataset option to control its appearance
    default_dataset_container = st.container()

    # Add the file upload option after the default dataset option
    uploaded_file = st.file_uploader("Upload your own CSV file with patient data", type=["csv"])

    # If a file is uploaded, show a message and disable the default checkbox
    if uploaded_file is not None:
        # Visual indication that custom data is being used
        st.success("✅ Using your uploaded file")

        # Disable the default dataset option with clear visual feedback
        with default_dataset_container:
            st.markdown("""
            <div style="opacity: 0.5; pointer-events: none;">
                <input type="checkbox" disabled> Use default dataset (data/patients.csv)
                <div style="font-size: 0.8em; color: #999; font-style: italic;">
                    Disabled because custom file is uploaded
                </div>
            </div>
            """, unsafe_allow_html=True)

        # Set use_default to False when a file is uploaded
        use_default = False
    else:
        # No file uploaded, show normal checkbox
        with default_dataset_container:
            use_default = st.checkbox("Use default dataset (data/patients.csv)", value=True)

    screening_task = st.text_input("Enter the medical screening task (e.g., 'Colonoscopy screening')", "")

    # Add contact information section
    st.markdown("---")
    st.subheader("Healthcare Provider Contact Information")
    st.caption("This information will appear in the emails sent to patients")

    # Create three columns for contact info fields
    col1, col2, col3 = st.columns(3)

    with col1:
        provider_name = st.text_input("Provider Name", "Benjamin Consolvo")

    with col2:
        provider_email = st.text_input("Provider Email", "[email protected]")

    with col3:
        provider_phone = st.text_input("Provider Phone", "123-456-7890")

    # Validate input fields before enabling the button
    required_fields_empty = (
        screening_task.strip() == "" or
        provider_name.strip() == "" or
        provider_email.strip() == "" or
        provider_phone.strip() == ""
    )

    if required_fields_empty:
        st.warning("Please fill in all required fields before proceeding.")
    st.markdown("---")
    # Move the button to the sidebar - disabled if required fields are empty
    generate = st.button("Generate Outreach Emails", disabled=required_fields_empty)

# Explicitly set environment variable to avoid TTY errors
os.environ["PYTHONUNBUFFERED"] = "1"

# Only run the generation logic if we're on the first tab
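# Editor's note (assumption): `_active` is not a documented attribute of the objects
# st.tabs returns, so this guard may not behave as intended across Streamlit versions;
# in practice the `generate` button state is what gates this block.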
if tab1._active and generate:
    # Since the button can only be clicked when all fields are filled,
    # we don't need additional validation here

    # Hugging Face secrets
    api_key = st.secrets["OPENAI_API_KEY"]
    base_url = st.secrets["OPENAI_BASE_URL"]

    # --- Initialize log ---
    log_messages = []
    def log(msg):
        log_messages.append(msg)
        # Show all messages in the scrollable container with better contrast
        log_placeholder.markdown(
            f"""
            <div style="height: 400px; overflow-y: auto; border: 1px solid #cccccc;
                        padding: 15px; border-radius: 5px; background-color: rgba(240, 242, 246, 0.4);
                        color: inherit; font-family: monospace;">
                {"<br>".join(log_messages)}
            </div>
            """,
            unsafe_allow_html=True
        )

    # Capture stdout/stderr during the workflow
    stdout_buffer = io.StringIO()
    stderr_buffer = io.StringIO()
    with contextlib.redirect_stdout(stdout_buffer), contextlib.redirect_stderr(stderr_buffer):
        if not screening_task:
            st.error("Please enter a medical screening task.")
        elif not uploaded_file and not use_default:
            st.error("Please upload a CSV file or select the default dataset.")
        else:
            # Load patient data
            if uploaded_file:
                patients_file = uploaded_file
            else:
                # Use absolute path for default dataset
                patients_file = os.path.join(os.path.dirname(__file__), "data/patients.csv")

            try:
                patients_df = pd.read_csv(patients_file)
            except Exception as e:
                st.error(f"Error reading the CSV file: {e}")
                st.stop()

            # Validate required columns
            required_columns = [
                'patient_id', 'First Name', 'Last Name', 'Email',
                'Patient diagnosis summary', 'age', 'gender', 'condition'
            ]
            if not all(col in patients_df.columns for col in required_columns):
                st.error(f"The uploaded CSV file is missing required columns: {required_columns}")
                st.stop()

            # Load configurations
            llama_filter_dict = {"model": ["meta-llama/Llama-3.3-70B-Instruct"]}
            deepseek_filter_dict = {"model": ["deepseek-ai/DeepSeek-R1-Distill-Llama-70B"]}
            config_list_llama = get_configs("OAI_CONFIG_LIST.json", llama_filter_dict)
            config_list_deepseek = get_configs("OAI_CONFIG_LIST.json", deepseek_filter_dict)

            # Ensure the API key from secrets is used
            for config in config_list_llama:
                config["api_key"] = api_key
            for config in config_list_deepseek:
                config["api_key"] = api_key

            # --- Log agent communication ---
            log("🟢 <b>Starting agent workflow...</b>")
            log("🧑‍⚕️ <b>Screening task:</b> " + screening_task)
            log("📄 <b>Loaded patient data:</b> {} records".format(len(patients_df)))

            # Generate criteria for outreach - Pass the custom prompts
            log("🤖 <b>Agent (Llama):</b> Generating outreach criteria...")
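            # The `"log_fn" in fn.__code__.co_varnames` checks below pass the UI logger
            # only when the helper actually declares a log_fn parameter -- a small
            # introspection guard that tolerates older versions of these functions.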
            criteria = asyncio.run(target_patients_outreach(
                screening_task, config_list_llama, config_list_deepseek,
                log_fn=log if "log_fn" in target_patients_outreach.__code__.co_varnames else None,
                user_proxy_prompt=st.session_state.user_proxy_prompt,
                epidemiologist_prompt=st.session_state.epidemiologist_prompt,
                doctor_critic_prompt=st.session_state.doctor_critic_prompt
            ))
            log("✅ <b>Criteria generated.</b>")

            # Find patients matching criteria
            log("🤖 <b>Agent (Llama):</b> Filtering patients based on criteria...")
            filtered_patients, arguments_criteria = asyncio.run(find_patients(
                criteria, config_list_llama,
                log_fn=log if "log_fn" in find_patients.__code__.co_varnames else None,
                patients_file_path=patients_file  # Use correct parameter name: patients_file_path
            ))
            log("✅ <b>Patients filtered.</b>")

            if filtered_patients.empty:
                log("⚠️ <b>No patients matched the criteria.</b>")
                st.warning("No patients matched the criteria.")
            else:
                # Initialize OpenAI client
                openai_client = OpenAI(api_key=api_key, base_url=base_url)

                # Generate outreach emails - Pass the custom email template
                log("🤖 <b>Agent (Llama):</b> Generating outreach emails...")
                asyncio.run(write_outreach_emails(
                    filtered_patients,
                    screening_task,
                    arguments_criteria,
                    openai_client,
                    config_list_llama[0]['model'],
                    phone=provider_phone,  # Pass the provider's phone from form
                    email=provider_email,  # Pass the provider's email from form
                    name=provider_name,  # Pass the provider's name from form
                    log_fn=log if "log_fn" in write_outreach_emails.__code__.co_varnames else None,
                    outreach_email_prompt_template=st.session_state.outreach_email_prompt
                ))

                # Make sure data directory exists (for Hugging Face Spaces)
                data_dir = os.path.join(os.path.dirname(__file__), "data")
                os.makedirs(data_dir, exist_ok=True)

                # Generate expected email filenames based on filtered patients
                expected_email_files = []
                for _, patient in filtered_patients.iterrows():
                    # Construct the expected filename based on patient data
                    firstname = patient['First Name']
                    lastname = patient['Last Name']
                    filename = f"{firstname}_{lastname}_email.txt"
                    if os.path.exists(os.path.join(data_dir, filename)):
                        expected_email_files.append(filename)

                # Use only the email files for patients in the filtered DataFrame
                email_files = expected_email_files

                if email_files:
                    log("✅ <b>Outreach emails generated successfully:</b> {} emails created".format(len(email_files)))
                    st.success(f"{len(email_files)} outreach emails have been generated!")

                    # Create a section for downloads
                    st.markdown("### Download Generated Emails")

                    # Store email content in session state to persist across interactions
                    if 'email_contents' not in st.session_state:
                        st.session_state.email_contents = {}
                    for email_file in email_files:
                        with open(os.path.join(data_dir, email_file), 'r') as f:
                            st.session_state.email_contents[email_file] = f.read()

                    # Create ZIP file only once and store in session state
                    if 'zip_buffer' not in st.session_state:
                        import zipfile
                        zip_buffer = io.BytesIO()
                        with zipfile.ZipFile(zip_buffer, 'w', zipfile.ZIP_DEFLATED) as zip_file:
                            for email_file, content in st.session_state.email_contents.items():
                                zip_file.writestr(email_file, content)
                        st.session_state.zip_buffer = zip_buffer.getvalue()

                    # Create base64 encoding of zip file
                    import base64
                    b64_zip = base64.b64encode(st.session_state.zip_buffer).decode()

                    # Create HTML for ZIP download - Use components.html instead of st.markdown
                    zip_html = f"""
                    <div style="margin-bottom: 20px;">
                        <a href="data:application/zip;base64,{b64_zip}"
                           download="patient_emails.zip"
                           style="text-decoration: none; display: inline-block; padding: 12px 18px;
                                  border: 1px solid #ddd; border-radius: 4px; background-color: #4CAF50;
                                  color: white; font-size: 16px; font-weight: bold; text-align: center;">
                            📦 Download All Emails as ZIP
                        </a>
                    </div>
                    """

                    # Use components.html instead of st.markdown for ZIP download
                    components.html(zip_html, height=70)

                    st.markdown("---")
                    st.markdown("#### Individual Email Downloads")

                    # Generate HTML for individual email downloads
                    individual_html = """
                    <div style="display: flex; flex-wrap: wrap; gap: 8px;">
                    """

                    # Generate download links for all emails
                    for i, email_file in enumerate(email_files):
                        file_content = st.session_state.email_contents.get(email_file, "")
                        # Create a base64 encoded version of the file content
                        b64_content = base64.b64encode(file_content.encode()).decode()

                        # Extract a more complete display name (First + Last name)
                        name_parts = email_file.split('_')[:2]  # Get first and last name parts
                        display_name = " ".join(name_parts)  # Join with space to create "First Last"

                        # Add download link to HTML
                        individual_html += f"""
                        <a href="data:text/plain;base64,{b64_content}"
                           download="{email_file}"
                           style="text-decoration: none; display: inline-block; margin: 4px; padding: 8px 12px;
                                  border: 1px solid #ddd; border-radius: 4px; background-color: #f0f2f6;
                                  color: #262730; font-size: 14px; text-align: center; min-width: 120px;">
                            {display_name}
                        </a>
                        """

                    individual_html += """
                    </div>
                    """

                    # Use components.html for individual downloads - estimate height based on number of emails
                    # Increase height calculation to account for potentially longer names
                    components.html(individual_html, height=100 + (len(email_files) // 4) * 60)

                else:
                    log("⚠️ <b>Email generation process completed but no email files were found.</b>")
                    st.warning("The email generation process completed but no email files were found in the data directory. This might indicate an issue with the email generation or file saving process.")

    # After workflow, append captured output
    std_output = stdout_buffer.getvalue()
    std_error = stderr_buffer.getvalue()

    if std_output:
        log_messages.append("<b>Terminal Output:</b>")
        for line in std_output.splitlines():
            if line.strip():  # Skip empty lines
                log_messages.append(line)
        # Update the log display with all messages using better contrast
        log_placeholder.markdown(
            f"""
            <div style="height: 400px; overflow-y: auto; border: 1px solid #cccccc;
                        padding: 15px; border-radius: 5px; background-color: rgba(240, 242, 246, 0.4);
                        color: inherit; font-family: monospace;">
                {"<br>".join(log_messages)}
            </div>
            """,
            unsafe_allow_html=True
        )

    if std_error:
        log_messages.append("<b style='color:#ff6b6b;'>Terminal Error:</b>")
        for line in std_error.splitlines():
            if line.strip():  # Skip empty lines
                log_messages.append(f"<span style='color:#ff6b6b;'>{line}</span>")
        # Update the log display with all messages
        log_placeholder.markdown(
            f"""
            <div style="height: 400px; overflow-y: auto; border: 1px solid #cccccc;
                        padding: 15px; border-radius: 5px; background-color: rgba(240, 242, 246, 0.4);
                        color: inherit; font-family: monospace;">
                {"<br>".join(log_messages)}
            </div>
            """,
            unsafe_allow_html=True
        )

# README Tab (Tab 2)
with tab2:
    readme_content = read_readme()

    # Process the README content to properly handle SVG images
    readme_with_embedded_svgs = fix_svg_images_in_markdown(readme_content)

    # Use unsafe_allow_html=True to render HTML content properly
    st.markdown(readme_with_embedded_svgs, unsafe_allow_html=True)

    # Add CSS to ensure SVGs are responsive and display properly
    st.markdown("""
    <style>
    svg {
        max-width: 100%;
        height: auto;
    }
    </style>
    """, unsafe_allow_html=True)
```
intelpreventativehealthcare.py
ADDED
@@ -0,0 +1,649 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
1 |
+
# The code is a simulation of a healthcare system that uses AI agents to manage patient outreach
|
2 |
+
# Author: Benjamin Consolvo
|
3 |
+
# Originally created in 2025
|
4 |
+
# Original code and idea from Mike Lynch on Medium here. Heavily modified.
|
5 |
+
# https://medium.com/@micklynch_6905/hospitalgpt-managing-a-patient-population-with-autogen-powered-by-gpt-4-mixtral-8x7b-ef9f54f275f1
|
6 |
+
# https://github.com/micklynch/hospitalgpt
|
7 |
+
|
8 |
+
import os
|
9 |
+
import asyncio
|
10 |
+
import pandas as pd
|
11 |
+
import json
|
12 |
+
import argparse
|
13 |
+
from typing import Callable, Dict, Any
|
14 |
+
from autogen import (
|
15 |
+
AssistantAgent,
|
16 |
+
UserProxyAgent,
|
17 |
+
config_list_from_json,
|
18 |
+
GroupChat,
|
19 |
+
GroupChatManager,
|
20 |
+
register_function,
|
21 |
+
)
|
22 |
+
from openai import OpenAI
|
23 |
+
from prompts.epidemiologist_prompt import EPIDEMIOLOGIST_PROMPT
|
24 |
+
from prompts.doctor_critic_prompt import DOCTOR_CRITIC_PROMPT
|
25 |
+
from prompts.user_proxy_prompt import USER_PROXY_PROMPT
|
26 |
+
from prompts.outreach_email_prompt import OUTREACH_EMAIL_PROMPT_TEMPLATE
|
27 |
+
import aiofiles # For asynchronous file writing
|
28 |
+
import functools # For wrapping synchronous functions in async
|
29 |
+
|
30 |
+
# Export the prompt variables for use in the app
|
31 |
+
__all__ = [
|
32 |
+
"get_configs", "target_patients_outreach", "find_patients",
|
33 |
+
"write_outreach_emails", "USER_PROXY_PROMPT", "EPIDEMIOLOGIST_PROMPT",
|
34 |
+
"DOCTOR_CRITIC_PROMPT", "OUTREACH_EMAIL_PROMPT_TEMPLATE"
|
35 |
+
]
|
36 |
+
|
37 |
+
def get_configs(
|
38 |
+
env_or_file: str,
|
39 |
+
filter_dict: Dict[str, Any]
|
40 |
+
) -> Dict[str, Any]:
|
41 |
+
"""
|
42 |
+
Load configuration from a JSON file.
|
43 |
+
|
44 |
+
Args:
|
45 |
+
env_or_file (str): Path to the JSON file or environment variable name.
|
46 |
+
filter_dict (Dict[str, Any]): Dictionary to filter the configuration file.
|
47 |
+
|
48 |
+
Returns:
|
49 |
+
Dict[str, Any]: Filtered configuration dictionary.
|
50 |
+
"""
|
51 |
+
return config_list_from_json(env_or_file=env_or_file, filter_dict=filter_dict)
|
52 |
+
|
53 |
+
async def target_patients_outreach(
|
54 |
+
target_screening: str,
|
55 |
+
config_list_llama: Dict[str, Any],
|
56 |
+
config_list_deepseek: Dict[str, Any],
|
57 |
+
log_fn=None,
|
58 |
+
user_proxy_prompt=USER_PROXY_PROMPT,
|
59 |
+
epidemiologist_prompt=EPIDEMIOLOGIST_PROMPT,
|
60 |
+
doctor_critic_prompt=DOCTOR_CRITIC_PROMPT
|
61 |
+
) -> str:
|
62 |
+
"""
|
63 |
+
Determines the criteria for patient outreach based on a screening task.
|
64 |
+
|
65 |
+
This function facilitates a conversation between a user, an epidemiologist,
|
66 |
+
and a doctor critic to define the criteria for patient outreach. The output
|
67 |
+
criteria from the doctor and epidemiologist include minimum age, maximum age,
|
68 |
+
gender, and a possible previous condition.
|
69 |
+
|
70 |
+
Example:
|
71 |
+
|
72 |
+
criteria = asyncio.run(target_patients_outreach("Type 2 diabetes screening"))
|
73 |
+
|
74 |
+
Args:
|
75 |
+
target_screening (str): The type of screening task (e.g., "Type 2 diabetes screening").
|
76 |
+
config_list_llama (Dict[str, Any]): Configuration for the Llama model.
|
77 |
+
config_list_deepseek (Dict[str, Any]): Configuration for the Deepseek model.
|
78 |
+
log_fn (callable, optional): Function for logging messages.
|
79 |
+
user_proxy_prompt (str, optional): Custom prompt for the user proxy agent.
|
80 |
+
epidemiologist_prompt (str, optional): Custom prompt for the epidemiologist agent.
|
81 |
+
doctor_critic_prompt (str, optional): Custom prompt for the doctor critic agent.
|
82 |
+
|
83 |
+
Returns:
|
84 |
+
str: The defined criteria for patient outreach.
|
85 |
+
"""
|
86 |
+
llm_config_llama: Dict[str, Any] = {
|
87 |
+
"cache_seed": 41,
|
88 |
+
"temperature": 0,
|
89 |
+
"config_list": config_list_llama,
|
90 |
+
"timeout": 120,
|
91 |
+
}
|
92 |
+
|
93 |
+
llm_config_deepseek: Dict[str, Any] = {
|
94 |
+
"cache_seed": 42,
|
95 |
+
"temperature": 0,
|
96 |
+
"config_list": config_list_deepseek,
|
97 |
+
"timeout": 120,
|
98 |
+
}
|
99 |
+
|
100 |
+
user_proxy = UserProxyAgent(
|
101 |
+
name="User",
|
102 |
+
is_termination_msg=lambda x: (
|
103 |
+
x.get("content", "") and x.get("content", "").rstrip().endswith("TERMINATE")
|
104 |
+
),
|
105 |
+
human_input_mode="NEVER",
|
106 |
+
description=user_proxy_prompt, # Use custom prompt
|
107 |
+
code_execution_config=False,
|
108 |
+
max_consecutive_auto_reply=1,
|
109 |
+
)
|
110 |
+
|
111 |
+
epidemiologist = AssistantAgent(
|
112 |
+
name="Epidemiologist",
|
113 |
+
system_message=epidemiologist_prompt, # Use custom prompt
|
114 |
+
llm_config=llm_config_llama,
|
115 |
+
code_execution_config=False,
|
116 |
+
is_termination_msg=lambda x: (
|
117 |
+
x.get("content", "") and x.get("content", "").rstrip().endswith("TERMINATE")
|
118 |
+
),
|
119 |
+
)
|
120 |
+
|
121 |
+
critic = AssistantAgent(
|
122 |
+
name="DoctorCritic",
|
123 |
+
system_message=doctor_critic_prompt, # Use custom prompt
|
124 |
+
llm_config=llm_config_deepseek,
|
125 |
+
human_input_mode="NEVER",
|
126 |
+
code_execution_config=False,
|
127 |
+
is_termination_msg=lambda x: (
|
128 |
+
x.get("content", "") and x.get("content", "").rstrip().endswith("TERMINATE")
|
129 |
+
),
|
130 |
+
)
|
131 |
+
|
132 |
+
groupchat = GroupChat(
|
133 |
+
agents=[user_proxy, epidemiologist, critic],
|
134 |
+
messages=[]
|
135 |
+
)
|
136 |
+
manager = GroupChatManager(groupchat=groupchat, llm_config=llm_config_llama)
|
137 |
+
|
138 |
+
user_proxy.initiate_chat(
|
139 |
+
manager,
|
140 |
+
message=target_screening,
|
141 |
+
)
|
142 |
+
if log_fn:
|
143 |
+
log_fn("Agent conversation complete.")
|
144 |
+
user_proxy.stop_reply_at_receive(manager)
|
145 |
+
result = user_proxy.last_message()["content"]
|
146 |
+
if log_fn:
|
147 |
+
log_fn(f"Criteria result: {result}")
|
148 |
+
return result
|
149 |
+
|
150 |
+
def get_patients_from_criteria(
|
151 |
+
patients_file: str,
|
152 |
+
min_age: int,
|
153 |
+
max_age: int,
|
154 |
+
criteria: str,
|
155 |
+
gender: str
|
156 |
+
) -> pd.DataFrame:
|
157 |
+
"""
|
158 |
+
Filters patient data from a CSV file based on specified criteria.
|
159 |
+
|
160 |
+
This function reads patient data from a CSV file and filters it based on
|
161 |
+
age range, gender, and a specific condition.
|
162 |
+
|
163 |
+
Example:
|
164 |
+
|
165 |
+
filtered_patients = get_patients_from_criteria(
|
166 |
+
patients_file="data/patients.csv",
|
167 |
+
min_age=40,
|
168 |
+
max_age=70,
|
169 |
+
criteria="Adenomatous Polyps",
|
170 |
+
gender="None"
|
171 |
+
)
|
172 |
+
|
173 |
+
Args:
|
174 |
+
patients_file (str): Path to the CSV file containing patient data.
|
175 |
+
min_age (int): Minimum age for filtering.
|
176 |
+
max_age (int): Maximum age for filtering.
|
177 |
+
criteria (str): Condition to filter patients by.
|
178 |
+
gender (str, optional): Gender to filter patients by. Defaults to None.
|
179 |
+
|
180 |
+
Returns:
|
181 |
+
pd.DataFrame: A DataFrame containing the filtered patient data.
|
182 |
+
"""
|
183 |
+
required_columns = [
|
184 |
+
'patient_id', 'First Name', 'Last Name', 'Email',
|
185 |
+
'Patient diagnosis summary', 'age', 'gender', 'condition'
|
186 |
+
]
|
187 |
+
|
188 |
+
# Support both file path (str) and file-like object (e.g., from Streamlit)
|
189 |
+
if hasattr(patients_file, "read"):
|
190 |
+
# Reset pointer in case it's been read before
|
191 |
+
patients_file.seek(0)
|
192 |
+
patients_df = pd.read_csv(patients_file)
|
193 |
+
else:
|
194 |
+
patients_df = pd.read_csv(patients_file)
|
195 |
+
|
196 |
+
for column in required_columns:
|
197 |
+
if column not in patients_df.columns:
|
198 |
+
raise ValueError(f"Missing required column: {column}")
|
199 |
+
|
200 |
+
# Ensure all text is lowercase for case-insensitive matching
|
201 |
+
patients_df['condition'] = patients_df['condition'].str.lower()
|
202 |
+
criteria = criteria.lower()
|
203 |
+
|
204 |
+
# Filter by condition matching
|
205 |
+
condition_filter = patients_df['condition'].str.contains(criteria, na=False)
|
206 |
+
|
207 |
+
# Filter by age range
|
208 |
+
age_filter = (patients_df['age'] >= min_age) & (patients_df['age'] <= max_age)
|
209 |
+
|
210 |
+
# Combine filters with OR logic
|
211 |
+
combined_filter = age_filter | condition_filter
|
212 |
+
|
213 |
+
if gender in ['M', 'F']:
|
214 |
+
gender_filter = patients_df['gender'].str.upper() == gender.upper()
|
215 |
+
combined_filter = combined_filter & gender_filter
|
216 |
+
|
217 |
+
return patients_df[combined_filter]
|
218 |
+
|
219 |
+
def register_function(
|
220 |
+
assistant: AssistantAgent,
|
221 |
+
user_proxy: UserProxyAgent,
|
222 |
+
func: Callable,
|
223 |
+
name: str,
|
224 |
+
description: str
|
225 |
+
) -> None:
|
226 |
+
"""
|
227 |
+
This function allows an assistant agent and a user proxy agent to execute
|
228 |
+
a specified function.
|
229 |
+
|
230 |
+
Example:
|
231 |
+
register_function(
|
232 |
+
assistant=assistant_agent,
|
233 |
+
user_proxy=user_proxy_agent,
|
234 |
+
func=my_function,
|
235 |
+
name="my_function",
|
236 |
+
description="This is a test function."
|
237 |
+
)
|
238 |
+
|
239 |
+
Args:
|
240 |
+
assistant (AssistantAgent): The assistant agent to register the function.
|
241 |
+
user_proxy (UserProxyAgent): The user proxy agent to register the function.
|
242 |
+
func (Callable): The function to register.
|
243 |
+
name (str): The name of the function.
|
244 |
+
description (str): A description of the function.
|
245 |
+
"""
|
246 |
+
|
247 |
+
assistant.register_for_llm(
|
248 |
+
name=name,
|
249 |
+
description=description
|
250 |
+
)(func)
|
251 |
+
|
252 |
+
user_proxy.register_for_execution(
|
253 |
+
name=name
|
254 |
+
)(func)
|
255 |
+
|
256 |
+
return None
|
257 |
+
|
258 |
+
async def find_patients(
|
259 |
+
criteria: str,
|
260 |
+
config_list_llama: Dict[str, Any],
|
261 |
+
log_fn=None,
|
262 |
+
patients_file_path=None # Can be a path or a file-like object
|
263 |
+
) -> pd.DataFrame:
|
264 |
+
"""
|
265 |
+
Finds patients matching specific criteria using agents.
|
266 |
+
|
267 |
+
This function uses a user proxy agent and a data analyst agent to filter
|
268 |
+
patient data based on the provided criteria.
|
269 |
+
|
270 |
+
Example:
|
271 |
+
patients_df = asyncio.run(find_patients(criteria="Patients aged 40 to 70"))
|
272 |
+
|
273 |
+
Args:
|
274 |
+
criteria (str): The criteria for filtering patients.
|
275 |
+
config_list_llama (Dict[str, Any]): Configuration for the Llama model.
|
276 |
+
log_fn (callable, optional): Function for logging messages.
|
277 |
+
patients_file_path: Path to patient data file or file-like object.
|
278 |
+
|
279 |
+
Returns:
|
280 |
+
pd.DataFrame: A DataFrame containing the filtered patient data.
|
281 |
+
"""
|
282 |
+
# Set up a temporary file path for the agent to use
|
283 |
+
temp_file_path = None
|
284 |
+
|
285 |
+
# If we have a file-like object (from Streamlit), save it to a temp file
|
286 |
+
if patients_file_path is not None and hasattr(patients_file_path, "read"):
|
287 |
+
try:
|
288 |
+
# Create data directory if it doesn't exist
|
289 |
+
os.makedirs("data", exist_ok=True)
|
290 |
+
temp_file_path = os.path.join("data", "temp_patients.csv")
|
291 |
+
|
292 |
+
# Reset the file pointer and read with pandas
|
293 |
+
patients_file_path.seek(0)
|
294 |
+
temp_df = pd.read_csv(patients_file_path)
|
295 |
+
|
296 |
+
# Save to the temp location
|
297 |
+
temp_df.to_csv(temp_file_path, index=False)
|
298 |
+
|
299 |
+
if log_fn:
|
300 |
+
log_fn(f"Saved uploaded file to temporary location: {temp_file_path}")
|
301 |
+
|
302 |
+
# Update the criteria to include the file path
|
303 |
+
criteria = f"The patient data is available at {temp_file_path}. " + criteria
|
304 |
+
except Exception as e:
|
305 |
+
if log_fn:
|
306 |
+
log_fn(f"Error preparing patient file: {str(e)}")
|
307 |
+
raise
|
308 |
+
elif isinstance(patients_file_path, str):
|
309 |
+
# It's a regular file path
|
310 |
+
temp_file_path = patients_file_path
|
311 |
+
criteria = f"The patient data is available at {temp_file_path}. " + criteria
|
312 |
+
|
313 |
+
# Configure the LLM
|
314 |
+
llm_config_llama: Dict[str, Any] = {
|
315 |
+
"cache_seed": 43,
|
316 |
+
"temperature": 0,
|
317 |
+
"config_list": config_list_llama,
|
318 |
+
"timeout": 120,
|
319 |
+
"tools": []
|
320 |
+
}
|
321 |
+
|
322 |
+
user_proxy = UserProxyAgent(
|
323 |
+
name="user_proxy",
|
324 |
+
code_execution_config={"last_n_messages": 2, "work_dir": "data/", "use_docker": False},
|
325 |
+
is_termination_msg=lambda x: x.get("content", "") and x.get(
|
326 |
+
"content", "").rstrip().endswith("TERMINATE"),
|
327 |
+
human_input_mode="NEVER",
|
328 |
+
llm_config=llm_config_llama,
|
329 |
+
# reflect_on_tool_use=True
|
330 |
+
)
|
331 |
+
|
332 |
+
data_analyst = AssistantAgent(
|
333 |
+
name="data_analyst",
|
334 |
+
code_execution_config={
|
335 |
+
"last_n_messages": 2,
|
336 |
+
"work_dir": "data/",
|
337 |
+
"use_docker": False},
|
338 |
+
llm_config=llm_config_llama,
|
339 |
+
# reflect_on_tool_use=True
|
340 |
+
)
|
341 |
+
|
342 |
+
register_function(
|
343 |
+
data_analyst,
|
344 |
+
user_proxy,
|
345 |
+
get_patients_from_criteria,
|
346 |
+
"get_patients_from_criteria",
|
347 |
+
"Extract and filter patient information based on criteria."
|
348 |
+
)
|
349 |
+
# --- Fix: Properly extract arguments from the agent conversation ---
|
350 |
+
arguments = None # Ensure arguments is defined in this scope
|
351 |
+
|
352 |
+
def user_proxy_reply(message: str):
|
353 |
+
nonlocal temp_file_path
|
354 |
+
try:
|
355 |
+
if "arguments:" in message:
|
356 |
+
arguments_str = message.split("arguments:")[1].strip().split("\n")[0]
|
357 |
+
args = eval(arguments_str)
|
358 |
+
|
359 |
+
# Override the file path with our temp file if available
|
360 |
+
if temp_file_path:
|
361 |
+
args['patients_file'] = temp_file_path
|
362 |
+
if log_fn:
|
363 |
+
log_fn(f"Using patient data from: {temp_file_path}")
|
364 |
+
|
365 |
+
return "Tool call received. \nTERMINATE", args
|
366 |
+
except Exception as e:
|
367 |
+
if log_fn:
|
368 |
+
log_fn(f"Error extracting arguments: {e}")
|
369 |
+
return f"Error executing function: {str(e)} \nTERMINATE"
|
370 |
+
return "Function call not recognized. \nTERMINATE"
|
371 |
+
|
372 |
+
user_proxy.reply_handler = user_proxy_reply
|
373 |
+
if log_fn:
|
374 |
+
log_fn(f"Set up reply handler with temp file path: {temp_file_path}")
|
375 |
+
|
376 |
+
groupchat = GroupChat(agents=[user_proxy, data_analyst], messages=[])
|
377 |
+
manager = GroupChatManager(groupchat=groupchat, llm_config=llm_config_llama)
|
378 |
+
|
379 |
+
chat_output = user_proxy.initiate_chat(data_analyst, message=f"{criteria}")
|
380 |
+
user_proxy.stop_reply_at_receive(manager)
|
381 |
+
if log_fn:
|
382 |
+
log_fn("Agent conversation for patient filtering complete.")
|
383 |
+
|
384 |
+
# Always extract arguments from chat history after chat
|
385 |
+
if chat_output and hasattr(chat_output, "chat_history"):
|
386 |
+
chat_history = chat_output.chat_history
|
387 |
+
for message in chat_history:
|
388 |
+
if "tool_calls" in message:
|
389 |
+
tool_calls = message["tool_calls"]
|
390 |
+
for tool_call in tool_calls:
|
391 |
+
function = tool_call.get("function", {})
|
392 |
+
try:
|
393 |
+
arguments = json.loads(function.get("arguments", None))
|
394 |
+
except Exception:
|
395 |
+
arguments = None
|
396 |
+
if arguments:
|
397 |
+
break
|
398 |
+
if arguments:
|
399 |
+
break
|
400 |
+
|
401 |
+
if not arguments:
|
402 |
+
if log_fn:
|
403 |
+
log_fn("Arguments were not populated during the chat process.")
|
404 |
+
raise ValueError("Arguments were not populated during the chat process.")

    # Always use the temp file path for the actual data load if available
    if temp_file_path and arguments:
        arguments['patients_file'] = temp_file_path

    filtered_df = get_patients_from_criteria(
        patients_file=arguments['patients_file'],
        min_age=arguments['min_age'],
        max_age=arguments['max_age'],
        criteria=arguments['criteria'],
        gender=arguments['gender']
    )
    if log_fn:
        log_fn(f"Filtered {len(filtered_df)} patients.")
    return filtered_df, arguments
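
# Note: find_patients ultimately calls get_patients_from_criteria directly
# with the arguments recovered from the chat history; the agent exchange
# serves to derive those arguments rather than to execute the tool itself.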


async def generate_email(openai_client, patient, email_prompt, model):
    """
    Asynchronously generate an email using the OpenAI client.

    Args:
        openai_client (OpenAI): The OpenAI client instance.
        patient (dict): The patient data.
        email_prompt (str): The email prompt to send to the model.
        model (str): The model to use for generation.

    Returns:
        str: The generated email content.
    """
    # Wrap the synchronous `create` method in an async function
    create_completion = functools.partial(
        openai_client.chat.completions.create,
        model=model,  # Use model from the OpenAI client
        messages=[{"role": "user", "content": email_prompt}],
        stream=False,
        seed=42,
        temperature=0  # Ensures consistent email output (limits creativity)
    )
    chat_completion = await asyncio.get_event_loop().run_in_executor(None, create_completion)
    return chat_completion.choices[0].message.content
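
# A native-async alternative (a sketch, assuming openai>=1.x, where
# AsyncOpenAI is available) would avoid the executor hop:
#   async_client = AsyncOpenAI(api_key=..., base_url=...)
#   resp = await async_client.chat.completions.create(
#       model=model, messages=[{"role": "user", "content": email_prompt}])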


async def write_email_to_file(file_path, patient, email_content):
    """
    Asynchronously write an email to a file.

    Args:
        file_path (str): The path to the file.
        patient (dict): The patient data.
        email_content (str): The email content to write.

    Returns:
        None
    """
    async with aiofiles.open(file_path, "w") as f:
        await f.write(f"Name: {patient['First Name']} {patient['Last Name']}\n")
        await f.write(f"Patient ID: {patient['patient_id']}\n")
        await f.write(f"Email: {patient['Email']}\n")
        await f.write(email_content)
        await f.write("\n")
        await f.write("-----------------------------------------")


async def write_outreach_emails(
    patient_details: pd.DataFrame,
    user_proposal: str,
    arguments_criteria: Dict[str, Any],
    openai_client: OpenAI,
    model: str,
    phone: str = "123-456-7890",
    email: str = "[email protected]",
    name: str = "Benjamin Consolvo",
    log_fn=None,
    outreach_email_prompt_template=OUTREACH_EMAIL_PROMPT_TEMPLATE
) -> None:
    """
    Asynchronously generates and writes outreach emails for patients.

    This function generates personalized emails for patients based on their
    details and the specified screening criteria. The emails are written to
    individual text files asynchronously.

    Args:
        patient_details (pd.DataFrame): DataFrame containing patient details.
        user_proposal (str): The type of screening task (e.g., "Colonoscopy screening").
        arguments_criteria (Dict[str, Any]): The criteria used for filtering patients.
        openai_client (OpenAI): The OpenAI client instance.
        model (str): Model name to use for generation.
        phone (str): Phone number to include in the outreach emails.
        email (str): Email address to include in the outreach emails.
        name (str): Name to include in the outreach emails.
        log_fn (callable, optional): Function for logging messages.
        outreach_email_prompt_template (str): Custom template for outreach emails.

    Returns:
        None
    """
    os.makedirs("data", exist_ok=True)
    if patient_details.empty:
        msg = "No patients found"
        print(msg)
        if log_fn:
            log_fn(msg)
        return

    async def process_patient(patient):
        # Ensure all required fields are present in the patient record
        required_fields = ['First Name', 'Last Name', 'patient_id', 'Email']
        for field in required_fields:
            if field not in patient or pd.isna(patient[field]):
                msg = f"Skipping patient record due to missing field: {field}"
                print(msg)
                if log_fn:
                    log_fn(msg)
                return

        # Validate the prompt template
        try:
            # Use the custom template instead of the default
            email_prompt = outreach_email_prompt_template.format(
                patient=patient.to_dict(),
                arguments_criteria=arguments_criteria,
                first_name=patient["First Name"],
                last_name=patient["Last Name"],
                user_proposal=user_proposal,
                name=name,
                phone=phone,
                email=email
            )
        except KeyError as e:
            msg = f"Error formatting email prompt: Missing key {e}. Skipping patient."
            print(msg)
            if log_fn:
                log_fn(msg)
            return

        msg = f'Generating email for {patient["First Name"]} {patient["Last Name"]}'
        print(msg)
        if log_fn:
            log_fn(msg)
        email_content = await generate_email(openai_client, patient, email_prompt, model)

        file_path = f"data/{patient['First Name']}_{patient['Last Name']}_email.txt"
        await write_email_to_file(file_path, patient, email_content)
        if log_fn:
            log_fn(f"Wrote email to {file_path}")

    tasks = [process_patient(patient) for _, patient in patient_details.iterrows()]
    await asyncio.gather(*tasks)

    msg = "All emails have been written to the 'data/' directory."
    print(msg)
    if log_fn:
        log_fn(msg)
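
# The process_patient coroutines are dispatched together via asyncio.gather,
# so emails for all filtered patients are generated concurrently; a record
# missing required fields is skipped without aborting the batch.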


def parse_arguments():
    """
    Parse command-line arguments for the script.

    Returns:
        argparse.Namespace: Parsed arguments.
    """
    parser = argparse.ArgumentParser(description="Run the Preventative Healthcare Intel script.")
    parser.add_argument(
        "--oai_config",
        type=str,
        required=True,
        help="Path to the OAI_CONFIG_LIST.json file."
    )
    parser.add_argument(
        "--target_screening",
        type=str,
        required=True,
        help="The type of screening task (e.g., 'Colonoscopy screening')."
    )
    parser.add_argument(
        "--patients_file",
        type=str,
        default="data/patients.csv",
        help="Path to the CSV file containing patient data. Default is 'data/patients.csv'."
    )
    parser.add_argument(
        "--phone",
        type=str,
        default="123-456-7890",
        help="Phone number to include in the outreach emails. Default is '123-456-7890'."
    )
    parser.add_argument(
        "--email",
        type=str,
        default="[email protected]",
        help="Email address to include in the outreach emails. Default is '[email protected]'."
    )
    parser.add_argument(
        "--name",
        type=str,
        default="Benjamin Consolvo",
        help="Name to include in the outreach emails. Default is 'Benjamin Consolvo'."
    )
    return parser.parse_args()
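
# Example invocation (a sketch; the config path, screening task, and patient
# file are illustrative):
#   python intelpreventativehealthcare.py \
#       --oai_config OAI_CONFIG_LIST.json \
#       --target_screening "Colonoscopy screening" \
#       --patients_file data/patients.csv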


if __name__ == "__main__":
    # Parse command-line arguments
    args = parse_arguments()

    llama_filter_dict = {"model": ["meta-llama/Llama-3.3-70B-Instruct"]}
    config_list_llama = get_configs(args.oai_config, llama_filter_dict)

    deepseek_filter_dict = {"model": ["deepseek-ai/DeepSeek-R1-Distill-Llama-70B"]}
    config_list_deepseek = get_configs(args.oai_config, deepseek_filter_dict)

    # Validate API key before initializing OpenAI client
    api_key = config_list_llama[0].get('api_key')

    if not api_key:
        config_list_llama[0]['api_key'] = config_list_deepseek[0]['api_key'] = api_key = os.environ.get("OPENAI_API_KEY")

    # Get the criteria for the target screening. The user provides the
    # screening task; the epidemiologist and doctor critic agents then
    # define the criteria for the outreach.
    filepath = os.path.join(os.getcwd(), args.patients_file)
    criteria = f"The patient data is located here: {filepath}."
    criteria += asyncio.run(target_patients_outreach(args.target_screening, config_list_llama, config_list_deepseek))

    # The user proxy agent and data analyst filter the patients based on the
    # criteria defined by the epidemiologist and doctor critic.
    patients_df, arguments_criteria = asyncio.run(find_patients(criteria, config_list_llama, patient_data_path=filepath))

    # Initialize the OpenAI client
    openai_client = OpenAI(
        api_key=api_key,
        base_url=config_list_llama[0]['base_url']
    )

    # Use the LLM to write the outreach emails to text files.
    asyncio.run(write_outreach_emails(
        patients_df,
        args.target_screening,
        arguments_criteria,
        openai_client,
        config_list_llama[0]['model'],
        phone=args.phone,
        email=args.email,
        name=args.name
    ))
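
# Taken together, the script runs in three stages: agents derive the
# screening criteria, the user proxy and data analyst filter the patient
# CSV against them, and the outreach emails are generated asynchronously.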

pyproject.toml
ADDED
@@ -0,0 +1,20 @@
[project]
name = "agentchat-intel-preventative-healthcare"
version = "0.1.0"
description = "AutoGen Agents for Preventative Healthcare"
readme = "README.md"
requires-python = ">=3.10"
dependencies = [
    "aiofiles>=24.1.0",
    "anyio>=4.9.0",
    "argparse>=1.4.0",
    "asyncio>=3.4.3",
    "autogen>=0.9",
    "autogen-ext[openai]>=0.5.6",
    "distro>=1.9.0",
    "litellm[proxy]>=1.68.0",
    "markitdown>=0.1.1",
    "openai>=1.75.0",
    "pandas>=2.2.3",
    "streamlit>=1.25.0",
]
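
With this file in place the project can also be installed as a package (e.g. `pip install -e .`). Note that `argparse` and `asyncio` ship with the standard library, so declaring them as dependencies is redundant on the `requires-python = ">=3.10"` floor this project sets, though harmless.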

requirements.txt
CHANGED
@@ -1,3 +1,12 @@
-
+distro
+autogen
+autogen-ext[openai]
+litellm[proxy]
+anyio
+markitdown
 pandas
+aiofiles
+argparse
+openai
+asyncio
 streamlit
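
The unpinned `requirements.txt` mirrors the dependency set declared in `pyproject.toml`; since Hugging Face Spaces installs Streamlit apps from `requirements.txt`, the two lists have to be kept in sync by hand.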

src/streamlit_app.py
DELETED
@@ -1,40 +0,0 @@
import altair as alt
import numpy as np
import pandas as pd
import streamlit as st

"""
# Welcome to Streamlit!

Edit `/streamlit_app.py` to customize this app to your heart's desire :heart:.
If you have any questions, checkout our [documentation](https://docs.streamlit.io) and [community
forums](https://discuss.streamlit.io).

In the meantime, below is an example of what you can do with just a few lines of code:
"""

num_points = st.slider("Number of points in spiral", 1, 10000, 1100)
num_turns = st.slider("Number of turns in spiral", 1, 300, 31)

indices = np.linspace(0, 1, num_points)
theta = 2 * np.pi * num_turns * indices
radius = indices

x = radius * np.cos(theta)
y = radius * np.sin(theta)

df = pd.DataFrame({
    "x": x,
    "y": y,
    "idx": indices,
    "rand": np.random.randn(num_points),
})

st.altair_chart(alt.Chart(df, height=700, width=700)
    .mark_point(filled=True)
    .encode(
        x=alt.X("x", axis=None),
        y=alt.Y("y", axis=None),
        color=alt.Color("idx", legend=None, scale=alt.Scale()),
        size=alt.Size("rand", legend=None, scale=alt.Scale(range=[1, 150])),
    ))