	---
	tags:
	- text-to-image
	- lora
	- template:diffusion-lora
	widget:
	- text: >-
	    steamboat willie style, golden era animation, a stylish woman walks down a
	    Tokyo street filled with warm glowing neon and animated city signage. She
	    wears a black leather jacket, a long red dress, and black boots, and
	    carries a black purse. She wears sunglasses and red lipstick. She walks
	    confidently and casually. The street is damp and reflective, creating a
	    mirror effect of the colorful lights. Many pedestrians walk about.
	  parameters:
	    negative_prompt: >-
	      色调艳丽，过曝，静态，细节模糊不清，字幕，风格，作品，画作，画面，静止，整体发灰，最差质量，低质量，JPEG压缩残留，丑陋的，残缺的，多余的手指，画得不好的手部，画得不好的脸部，畸形的，毁容的，形态畸形的肢体，手指融合，静止不动的画面，杂乱的背景，三条腿，背景人很多，倒着走
	  output:
	    url: videos/t2v-1.webp
	- text: >-
	    steamboat willie style, golden era animation, close-up of a short fluffy
	    monster kneeling beside a melting red candle. the mood is one of wonder and
	    curiosity, as the monster gazes at the flame with wide eyes and open mouth.
	    Its pose and expression convey a sense of innocence and playfulness, as if
	    it is exploring the world around it for the first time. The use of warm
	    colors and dramatic lighting further enhances the cozy atmosphere of the
	    image.
	  parameters:
	    negative_prompt: >-
	      色调艳丽，过曝，静态，细节模糊不清，字幕，风格，作品，画作，画面，静止，整体发灰，最差质量，低质量，JPEG压缩残留，丑陋的，残缺的，多余的手指，画得不好的手部，画得不好的脸部，畸形的，毁容的，形态畸形的肢体，手指融合，静止不动的画面，杂乱的背景，三条腿，背景人很多，倒着走
	  output:
	    url: videos/t2v-2.webp
	base_model: Wan-AI/Wan2.1-T2V-14B
	instance_prompt: steamboat willie style, golden era animation
	license: cc0-1.0
	pipeline_tag: text-to-video
	library_name: diffusers
	---

	# Steamboat Willie LoRA

	<Gallery />

	## Model Description

	Trained on clips from [Steamboat Willie](https://archive.org/details/steamboat-willie-mickey), split by scene and captioned using [Qwen2.5-VL-3B-Instruct](https://huggingface.co/Qwen/Qwen2.5-VL-3B-Instruct).

	Also available for [Wan2.1-T2V-1.3B](https://huggingface.co/benjamin-paine/steamboat-willie-1.3b).

	Additionally hosted [on CivitAI](https://civitai.com/models/1357058?modelVersionId=1532988).

	## Trigger Words

	The model was trained with the trigger phrase "steamboat willie style". I get the best results when combining this trigger phrase with "golden era animation".
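
As a minimal sketch of the advice above, the helper below prepends the trigger phrase and the style hint to a scene description. The names `TRIGGER_PHRASE`, `STYLE_HINT`, and `build_prompt` are hypothetical and exist only for this illustration; they are not part of the model or of Diffusers.

```python
# Hypothetical helper for illustration only; not part of the model or Diffusers.
TRIGGER_PHRASE = "steamboat willie style"
STYLE_HINT = "golden era animation"


def build_prompt(description: str) -> str:
    """Return the description prefixed with the trigger phrase and style hint."""
    description = description.strip()
    # Leave the prompt alone if it already starts with the trigger phrase.
    if description.lower().startswith(TRIGGER_PHRASE):
        return description
    return f"{TRIGGER_PHRASE}, {STYLE_HINT}, {description}"


print(build_prompt("a cat tips its hat and bows"))
```

The resulting string can be passed as `prompt` in the Diffusers example in the next section.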

	## Using with Diffusers
	```sh
	pip install git+https://github.com/huggingface/diffusers.git
	```

	```py
	import torch
	from diffusers.utils import export_to_video
	from diffusers import AutoencoderKLWan, WanPipeline
	from diffusers.schedulers.scheduling_unipc_multistep import UniPCMultistepScheduler

	model_id = "Wan-AI/Wan2.1-T2V-14B-Diffusers"
	vae = AutoencoderKLWan.from_pretrained(model_id, subfolder="vae", torch_dtype=torch.float32)
	pipe = WanPipeline.from_pretrained(model_id, vae=vae, torch_dtype=torch.bfloat16)
	pipe.scheduler = UniPCMultistepScheduler.from_config(
	    pipe.scheduler.config,
	    flow_shift=5.0,
	)
	pipe.to("cuda")
	pipe.load_lora_weights("benjamin-paine/steamboat-willie-14b")
	pipe.enable_model_cpu_offload()  # for low-VRAM environments

	prompt = "steamboat willie style, golden era animation, an anthropomorphic cat character wearing a hat removes it and performs a courteous bow"
	negative_prompt = "色调艳丽，过曝，静态，细节模糊不清，字幕，风格，作品，画作，画面，静止，整体发灰，最差质量，低质量，JPEG压缩残留，丑陋的，残缺的，多余的手指，画得不好的手部，画得不好的脸部，畸形的，毁容的，形态畸形的肢体，手指融合，静止不动的画面，杂乱的背景，三条腿，背景人很多，倒着走"
	output = pipe(
	    prompt=prompt,
	    negative_prompt=negative_prompt,
	    height=720,
	    width=1280,
	    num_frames=81,
	    guidance_scale=5.0,
	    num_inference_steps=32,
	).frames[0]
	export_to_video(output, "output.mp4", fps=16)
	```
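
For a rough sense of clip length, the settings above (`num_frames=81`, exported at `fps=16`) work out to about five seconds of video. The sketch below shows the arithmetic; `clip_duration_seconds` is a hypothetical helper written for this note, not a Diffusers function.

```python
def clip_duration_seconds(num_frames: int, fps: int) -> float:
    """Return the playback length in seconds of a clip exported at a fixed frame rate."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return num_frames / fps


# Values taken from the example above: 81 frames exported at 16 fps.
print(clip_duration_seconds(81, 16))  # 5.0625
```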

	## Download Model

	Weights for this model are available in Safetensors format.

	[Download](/benjamin-paine/steamboat-willie-14b/tree/main) them in the Files & versions tab.