import gradio as gr
from diffusers import StableDiffusionPipeline
from transformers import pipeline

# Initialize the models once at startup. The model names are examples; swap in
# any compatible checkpoints available on the Hugging Face Hub.

# Story generation: GPT-2 through the transformers text-generation pipeline
story_generator = pipeline("text-generation", model="gpt2")

# Image generation: transformers' pipeline() has no text-to-image task, so
# Stable Diffusion is loaded through the diffusers library instead
image_generator = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")
image_generator = image_generator.to("cpu")  # Use "cuda" if you have a GPU

# Text-to-speech: "facebook/tts_transformer-en-ljspeech" is a fairseq checkpoint
# that transformers' pipeline() cannot load. "facebook/mms-tts-eng" is assumed
# here as a transformers-compatible English replacement.
tts = pipeline("text-to-speech", model="facebook/mms-tts-eng")

def generate_story_image_audio(prompt):
    """
    Generate a story, an image, and audio based on the user's prompt.
    Args:
        prompt (str): The input prompt (e.g., "A brave little dragon").
    Returns:
        tuple: (story text, image, (sampling rate, audio samples)).
    """
    # Step 1: Generate the story. Sampling must be enabled for temperature to
    # take effect, and max_new_tokens bounds the continuation, not the prompt.
    story_output = story_generator(
        prompt,
        max_new_tokens=100,
        num_return_sequences=1,
        do_sample=True,
        temperature=0.7,
    )
    story = story_output[0]["generated_text"].strip()

    # Step 2: Generate an image based on the story
    image = image_generator(story, num_inference_steps=30).images[0]  # Generate one image

    # Step 3: Generate audio from the story. The pipeline returns a dict holding
    # a NumPy waveform ("audio") and its sample rate ("sampling_rate"), not
    # encoded WAV bytes, so pass both to gr.Audio instead of writing a file.
    audio_output = tts(story)
    audio = (audio_output["sampling_rate"], audio_output["audio"].squeeze())

    return story, image, audio

# Create the Gradio interface
interface = gr.Interface(
    fn=generate_story_image_audio,
    inputs=gr.Textbox(label="Enter a story prompt (e.g., 'A brave little dragon')"),
    outputs=[
        gr.Textbox(label="Generated Story"),
        gr.Image(label="Story Illustration"),
        gr.Audio(label="Story Narration")
    ],
    title="Kids' Story Generator",
    description="Generate a short story, illustration, and audio narration for kids based on your prompt!"
)

# Launch the interface
interface.launch()