Requirements:

pip install opencv-python
pip install albumentations
pip install accelerate
pip install torch==2.2.1
pip install transformers==4.39.0  # may also work with more recent versions
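
To confirm that the environment matches the pins above before loading the model, a small check like the following may help. This is a minimal sketch, not part of the original instructions: `PINS` and `report_versions` are hypothetical names, and the comparison is a simple prefix match so that local build suffixes such as `+cu121` are tolerated.

```python
from importlib.metadata import PackageNotFoundError, version

# Pins copied from the requirements list above.
PINS = {"torch": "2.2.1", "transformers": "4.39.0"}


def report_versions(pins, lookup=version):
    """Return {package: (pinned, installed or None)} for each pinned package."""
    found = {}
    for name, pinned in pins.items():
        try:
            installed = lookup(name)
        except PackageNotFoundError:
            # Package is not installed in this environment.
            installed = None
        found[name] = (pinned, installed)
    return found


for name, (pinned, installed) in report_versions(PINS).items():
    # Prefix match only: "2.2.1+cu121" counts as matching "2.2.1".
    matches = installed is not None and installed.startswith(pinned)
    status = "ok" if matches else "differs"
    print(f"{name}: pinned {pinned}, installed {installed} ({status})")
```

A "differs" line is not necessarily a problem, since newer transformers versions may also work.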

Adapted sample script for structured radiology report generation (SRRG):

import tempfile

import requests
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# step 1: Setup constants
model_name = "StanfordAIMI/CheXagent-2-3b-srrg-findings"
dtype = torch.bfloat16
device = "cuda"

# step 2: Load tokenizer and model
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto", trust_remote_code=True)
model = model.to(dtype)
model.eval()

# step 3: Download image from URL, save to a local file, and prepare path list
url = "https://huggingface.co/IAMJB/interpret-cxr-impression-baseline/resolve/main/effusions-bibasal.jpg"
resp = requests.get(url, timeout=30)  # fail instead of hanging if the server does not respond
resp.raise_for_status()

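# Write the image to a NamedTemporaryFile. delete=False keeps the file on disk
# after the with-block, so the path stays valid for the later steps; remove the
# file manually once generation is done.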
with tempfile.NamedTemporaryFile(delete=False, suffix=".jpg") as tmpfile:
    tmpfile.write(resp.content)
    local_path = tmpfile.name  # this is a real file path on disk

paths = [local_path]

prompt = "Structured Radiology Report Generation for Findings Section"
# build the multimodal input
query = tokenizer.from_list_format(
    [*([{"image": img} for img in paths]), {"text": prompt}]
)

# format as a chat conversation
conv = [
    {"from": "system", "value": "You are a helpful assistant."},
    {"from": "human", "value": query},
]

# tokenize and generate
input_ids = tokenizer.apply_chat_template(
    conv, add_generation_prompt=True, return_tensors="pt"
)
output = model.generate(
    input_ids.to(device),
    do_sample=False,
    num_beams=1,
    temperature=1.0,
    top_p=1.0,
    use_cache=True,
    max_new_tokens=512,
)[0]

# decode only the newly generated tokens: skip the prompt and drop the final
# token (normally the end-of-sequence marker)
response = tokenizer.decode(output[input_ids.size(1) : -1])
print(response)

Response:

Lungs and Airways:
- No evidence of pneumothorax.

Pleura:
- Bilateral pleural effusions.

Cardiovascular:
- Cardiomegaly.

Other:
- Bibasilar opacities.
- Mild pulmonary edema.
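
Because the response follows a regular "Section:" / "- finding" layout, it can be converted into a dictionary for downstream use. The following is a minimal sketch, assuming every section header ends with a colon and every finding starts with "- ", as in the response above. `parse_findings` is a hypothetical helper, not part of the model's API.

```python
def parse_findings(report: str) -> dict[str, list[str]]:
    """Group "- finding" lines under the preceding "Section:" header."""
    sections: dict[str, list[str]] = {}
    current = None
    for raw in report.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("- "):
            # Findings that appear before any header are ignored.
            if current is not None:
                sections[current].append(line[2:].strip())
        elif line.endswith(":"):
            current = line[:-1].strip()
            sections.setdefault(current, [])
    return sections


# The response shown above, used as sample input.
sample = """Lungs and Airways:
- No evidence of pneumothorax.

Pleura:
- Bilateral pleural effusions.

Cardiovascular:
- Cardiomegaly.

Other:
- Bibasilar opacities.
- Mild pulmonary edema.
"""

parsed = parse_findings(sample)
for section, findings in parsed.items():
    print(section, "->", findings)
```

In the sample script, the same helper could be applied to `response` directly. Output that deviates from this layout (for example free text without bullets) would be silently skipped, so a stricter parser may be preferable for production use.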