	---
	license: openrail++
	language:
	- en
	base_model:
	- PixArt-alpha/PixArt-XL-2-1024-MS
	pipeline_tag: text-to-image
	tags:
	- pixart
	- gguf-node
	widget:
	- text: a close-up shot of a beautiful girl in a serene world. She has white hair
	    and is blindfolded, with a calm expression. Her hands are pressed together in
	    a prayer pose, with fingers interlaced and palms touching. The background is softly
	    blurred, enhancing her ethereal presence.
	  parameters:
	    negative_prompt: blurry, cropped, ugly
	  output:
	    url: samples\ComfyUI_00007_.png
	- text: a wizard with a glowing staff and a glowing hat, colorful magic, dramatic
	    atmosphere, sharp focus, highly detailed, cinematic, original composition, fine
	    detail, intricate, elegant, creative, color spread, shiny, amazing, symmetry,
	    illuminated, inspired, pretty, attractive, artistic, dynamic background, relaxed,
	    professional, extremely inspirational, beautiful, determined, cute, adorable,
	    best
	  parameters:
	    negative_prompt: blurry, cropped, ugly
	  output:
	    url: samples\ComfyUI_00008_.png
	- text: a girl stands amidst scattered glass shards, surrounded by a beautifully crafted
	    and expansive world. The scene is depicted from a dynamic angle, emphasizing her
	    determined expression. The background features vast landscapes with floating crystals
	    and soft, glowing lights that create a mystical and grand atmosphere.
	  parameters:
	    negative_prompt: blurry, cropped, ugly
	  output:
	    url: samples\ComfyUI_00009_.png
	- text: close-up portrait of girl
	  output:
	    url: samples\ComfyUI_00001_.png
	- text: close-up portrait of cat
	  output:
	    url: samples\ComfyUI_00002_.png
	- text: close-up portrait of young lady
	  output:
	    url: samples\ComfyUI_00003_.png
	---

	# gguf quantized version of pixart

	<Gallery />

	## setup (once)
	- drag pixart-xl-2-1024-ms-q4_k_m.gguf [[1GB](https://huggingface.co/calcuis/pixart/blob/main/pixart-xl-2-1024-ms-q4_k_m.gguf)] to > ./ComfyUI/models/diffusion_models
	- drag t5xxl_fp16-q4_0.gguf [[2.9GB](https://huggingface.co/calcuis/pixart/blob/main/t5xxl_fp16-q4_0.gguf)] to > ./ComfyUI/models/text_encoders
	- drag pixart_vae_fp8_e4m3fn.safetensors [[83.7MB](https://huggingface.co/calcuis/pixart/blob/main/pixart_vae_fp8_e4m3fn.safetensors)] to > ./ComfyUI/models/vae
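
The three placements above can also be scripted. The following is a minimal sketch, assuming the files have already been downloaded into a single folder. `PLACEMENT` and `place_models` are hypothetical names made up for illustration (they are not part of gguf-node or ComfyUI); the file names and target folders are the ones listed above.

```python
import shutil
from pathlib import Path

# File name -> ComfyUI models subfolder, copied from the setup list above.
PLACEMENT = {
    "pixart-xl-2-1024-ms-q4_k_m.gguf": "diffusion_models",
    "t5xxl_fp16-q4_0.gguf": "text_encoders",
    "pixart_vae_fp8_e4m3fn.safetensors": "vae",
}


def place_models(download_dir, comfy_root="./ComfyUI"):
    """Move each downloaded file into its ComfyUI models subfolder.

    Returns the names of the files that were not found in download_dir.
    (Hypothetical helper, for illustration only.)
    """
    missing = []
    for name, subfolder in PLACEMENT.items():
        src = Path(download_dir) / name
        if not src.is_file():
            missing.append(name)
            continue
        dest_dir = Path(comfy_root) / "models" / subfolder
        dest_dir.mkdir(parents=True, exist_ok=True)
        shutil.move(str(src), str(dest_dir / name))
    return missing
```

Calling `place_models(Path.home() / "Downloads")` from the folder that contains `ComfyUI` moves whatever it finds and returns the list of files still to be downloaded.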

	## run it straight (no installation needed)
	- run the .bat file in the main directory (assuming you are using the gguf-node [pack](https://github.com/calcuis/gguf/releases) below)
	- drag the workflow json file (below) or the demo picture above to > your browser

	### workflow
	- example workflow for [gguf](https://huggingface.co/calcuis/pixart/blob/main/workflow-pixart-gguf.json)
	- example workflow for [safetensors](https://huggingface.co/calcuis/pixart/blob/main/workflow-pixart-safetensors.json)

	### review
	- set the output image size to match the resolution stated in the model name, i.e., 1024x1024 or 512x512
	- pixart-xl-2-1024-ms and pixart-sigma-xl-2-1024-ms are recommended (with 1024x1024 size)
	- small model but good quality pictures; the t5 encoder lets you input a short description or sentence instead of tag(s)
	- more quantized versions of t5xxl encoder can be found [here](https://huggingface.co/chatpig/t5xxl/tree/main)
	- upgrade your gguf-node (see the last item in reference list below) to the latest version for pixart model support
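
The size rule in the first item can be sketched as a small lookup. This assumes the resolution is encoded in the model file name, as it is in `pixart-xl-2-1024-ms-q4_k_m.gguf`. `native_size` is a hypothetical helper written for this note, not a gguf-node or ComfyUI function.

```python
import re


def native_size(model_name):
    """Return (width, height) guessed from a pixart model file name.

    Assumption: the name carries the resolution as a "-512" or "-1024"
    segment. Raises ValueError when no such segment is present.
    """
    match = re.search(r"-(512|1024)(?![0-9])", model_name)
    if match is None:
        raise ValueError(f"no 512/1024 resolution found in {model_name!r}")
    side = int(match.group(1))
    return side, side
```

The returned pair is what you would enter as the width and height of the output image in the workflow.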

	### paper
	- [pixart-α](https://arxiv.org/pdf/2310.00426)
	- [pixart-Σ](https://arxiv.org/pdf/2403.04692)
	- [high-resolution image synthesis](https://arxiv.org/pdf/2112.10752)

	### reference
	- base model from [pixart-alpha](https://huggingface.co/PixArt-alpha)
	- comfyui [comfyanonymous](https://github.com/comfyanonymous/ComfyUI)
	- gguf-node ([pypi](https://pypi.org/project/gguf-node)\|[repo](https://github.com/calcuis/gguf)\|[pack](https://github.com/calcuis/gguf/releases))