import os
import requests
from PIL import Image
import streamlit as st
import torch
from huggingface_hub import login
from transformers import AutoProcessor, AutoModelForCausalLM
from diffusers import DiffusionPipeline

# Hugging Face token setup
hf_token = os.getenv('HF_AUTH_TOKEN')
if not hf_token:
    raise ValueError("Hugging Face token is not set in the environment variables.")
login(token=hf_token)

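# Load the models once per process. Streamlit reruns this script on every
# interaction, so uncached loading would reinitialize both models each time.
@st.cache_resource
def load_models(caption_model_name):
    # Use the GPU if available
    device = "cuda" if torch.cuda.is_available() else "cpu"
    # Stable Diffusion pipeline
    pipe = DiffusionPipeline.from_pretrained("stabilityai/stable-diffusion-3.5-medium").to(device)
    # Captioning model and processor
    processor = AutoProcessor.from_pretrained(caption_model_name)
    model = AutoModelForCausalLM.from_pretrained(caption_model_name).to(device)
    return pipe, processor, model, device

caption_model_name = "pretrained-caption-model"  # Replace with the actual model name
pipe, processor, model, device = load_models(caption_model_name)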

# Streamlit UI
st.title("Image Caption and Design Generator")
st.write("Upload an image or provide an image URL to generate a caption and use it to create a similar design.")

# Image upload or URL input
img_file = st.file_uploader("Choose an image...", type=["png", "jpg", "jpeg"])
img_url = st.text_input("Or provide an image URL:")

# Process the image
raw_image = None
if img_file:
    raw_image = Image.open(img_file).convert("RGB")
    st.image(raw_image, caption="Uploaded Image", use_column_width=True)
elif img_url:
    try:
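        # Time out slow hosts and fail on HTTP errors before decoding the image
        response = requests.get(img_url, stream=True, timeout=10)
        response.raise_for_status()
        response.raw.decode_content = True  # undo gzip/deflate content encoding
        raw_image = Image.open(response.raw).convert("RGB")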
        st.image(raw_image, caption="Image from URL", use_column_width=True)
    except Exception as e:
        st.error(f"Error loading image from URL: {e}")

# Generate caption and design
if raw_image is not None and st.button("Generate Caption and Design"):
    with st.spinner("Generating caption..."):
        # Generate caption
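        # Only an image is passed, so text-only options (padding, truncation) are dropped
        inputs = processor(images=raw_image, return_tensors="pt")
        inputs = {key: val.to(device) for key, val in inputs.items()}
        # Inference only: skip gradient tracking and cap the caption length
        with torch.inference_mode():
            out = model.generate(**inputs, max_length=250)
        caption = processor.decode(out[0], skip_special_tokens=True)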
        st.success("Generated Caption:")
        st.write(caption)

    with st.spinner("Generating similar design..."):
        # Generate similar design using the caption as a prompt
        generated_image = pipe(caption).images[0]
        st.success("Generated Design:")
        st.image(generated_image, caption="Design Generated from Caption", use_column_width=True)