
https://huggingface.co/thomaseding/pixelnet

--- license: creativeml-openrail-m ---

PixelNet (Thomas Eding)

About:

PixelNet is a ControlNet model for Stable Diffusion.

It takes a checkerboard image as input, which is used to control where logical pixels are to be placed.

This is currently an experimental proof of concept. I trained it on around 2000 pixel-art/pixelated images that I generated using Stable Diffusion (with a lot of cleanup and manual curation). The model is not very good, but it does work on grids of up to about 64 checker "pixels" per side for square generations. I also found that a 128x64 pattern still worked moderately well for a 1024x512 image.

The model works best with the "Balanced" ControlNet setting. Try using a "Control Weight" of 1 or a little higher.

"ControlNet Is More Important" seems to require a heavy "Control Weight" setting to have an effect. Try using a "Control Weight" of 2.

Smaller checker grids tend to perform worse (e.g. 5x5 vs. 32x32).

A "Steps" value that is too low or too high breaks the model. Try something like 15-30, depending on an assortment of factors. Feel free to experiment with the built-in A1111 "X/Y/Z Plot" script.

Usage:

To install, copy the .safetensors and .yaml files to your Automatic1111 ControlNet extension's model directory (e.g. stable-diffusion-webui/extensions/sd-webui-controlnet/models). Completely restart the Automatic1111 server after doing this and then refresh the web page.

There is no preprocessor. Instead, supply a black and white checkerboard image as the control input. Examples are in the example-control-images directory of this repository. (https://huggingface.co/thomaseding/pixelnet/tree/main/example-control-images)

The script gen_checker.py (https://huggingface.co/thomaseding/pixelnet/blob/main/gen_checker.py) can be used to generate checkerboard images of arbitrary sizes. For example, `python gen_checker.py --upscale-dims 512x512 --dims 70x70 --output-file control.png` generates a 70x70 checkerboard image upscaled to 512x512 pixels.

grid5x5

grid16x16
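For illustration, a checkerboard control image of the kind described above can be sketched in a few lines of Python. This is not the repository's gen_checker.py. The function names, the white top-left cell, and the nearest-neighbor mapping from output pixels to logical cells are all assumptions made for this sketch. Pillow is needed only by the hypothetical save_checker helper.

```python
def checker_rows(grid_w, grid_h, out_w, out_h):
    """Return out_h rows of out_w grayscale values (0 or 255).

    Assumption: output pixel (x, y) belongs to logical cell
    (x * grid_w // out_w, y * grid_h // out_h), i.e. nearest-neighbor upscaling.
    """
    rows = []
    for y in range(out_h):
        cy = y * grid_h // out_h  # logical row containing output row y
        row = []
        for x in range(out_w):
            cx = x * grid_w // out_w  # logical column containing output column x
            # Assumption: the top-left cell is white; neighbors alternate.
            row.append(255 if (cx + cy) % 2 == 0 else 0)
        rows.append(row)
    return rows


def save_checker(path, grid_w, grid_h, out_w, out_h):
    """Write the checkerboard to an image file (hypothetical helper, needs Pillow)."""
    from PIL import Image  # imported lazily so checker_rows works without Pillow

    rows = checker_rows(grid_w, grid_h, out_w, out_h)
    img = Image.new("L", (out_w, out_h))
    img.putdata([value for row in rows for value in row])
    img.save(path)


if __name__ == "__main__":
    # A 32x32 logical grid upscaled to 512x512, so each cell is 16x16 pixels.
    save_checker("control.png", 32, 32, 512, 512)
```

When the grid does not evenly divide the output size, this mapping yields cells whose widths differ by one pixel. Whether that matches the images the model was trained on is not something this sketch can guarantee.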

FAQ:

Q: Why is this needed? Can't I use a post-processor to downscale the image?

A: In my experience, SD has a hard time creating genuine pixel art (even with dedicated base models and LoRAs): the output has mismatched logical pixel sizes, smooth curves, etc. What appears to be a straight line at a glance might actually bend. This can cause post-processors to create artifacts when quantization rounds a pixel to a position one pixel off in some direction. This model is intended to help fix that.

Q: Should I use this model with a post-processor?

A: Yes, I still recommend you do post-processing to clean up the image. This model is not perfect and will still have artifacts. Note that none of the sample output images are post-processed; they are raw outputs from the model. Consider sampling the image based on the location of the control grid checker faces. I will provide a custom script specialized for this in the near future.
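One way to read "sampling the image based on the location of the control grid checker faces" is to take a single pixel from the center of each checker face, which gives an image with exactly one pixel per logical pixel. The sketch below is not the custom script mentioned above. The function names are hypothetical, and it assumes the faces are laid out evenly across the image and that center sampling is good enough (averaging or a majority vote over each face may work better).

```python
def sample_cell_centers(pixels, grid_w, grid_h):
    """Downscale a 2D list of pixels to grid_h rows of grid_w values.

    Assumption: logical cell (cx, cy) is centered at
    ((cx + 0.5) * width / grid_w, (cy + 0.5) * height / grid_h).
    """
    height = len(pixels)
    width = len(pixels[0])
    result = []
    for cy in range(grid_h):
        y = int((cy + 0.5) * height / grid_h)  # center row of this cell
        row = []
        for cx in range(grid_w):
            x = int((cx + 0.5) * width / grid_w)  # center column of this cell
            row.append(pixels[y][x])
        result.append(row)
    return result


def downscale_file(src_path, dst_path, grid_w, grid_h):
    """Apply sample_cell_centers to an image file (hypothetical helper, needs Pillow)."""
    from PIL import Image  # imported lazily so the pure function works without Pillow

    src = Image.open(src_path).convert("RGB")
    width, height = src.size
    pixels = [[src.getpixel((x, y)) for x in range(width)] for y in range(height)]
    dst = Image.new("RGB", (grid_w, grid_h))
    for cy, row in enumerate(sample_cell_centers(pixels, grid_w, grid_h)):
        for cx, color in enumerate(row):
            dst.putpixel((cx, cy), color)
    dst.save(dst_path)
```

Pass the same grid dimensions that were used for the control image, for example `downscale_file("output.png", "pixels.png", 32, 32)` for a 32x32 checkerboard (the file names here are placeholders).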

Q: Does the model support non-square grids?

A: Kind of. I trained it with some imperfect square grids (where the checkerboard dimensions do not evenly divide the upscaled image size), so in that sense it should work fine. I also trained it with some checkerboard images with genuinely non-square rectangular faces (e.g. double-wide pixels).
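To see what an imperfect grid looks like in practice, the sketch below counts how many output pixels each logical cell receives along one axis. It assumes nearest-neighbor upscaling (output pixel x belongs to cell `x * n_cells // size`), which may not match how gen_checker.py or the training data were produced. cell_widths is a hypothetical helper.

```python
from collections import Counter


def cell_widths(n_cells, size):
    """Return the pixel width of each of n_cells cells spread over size pixels.

    Assumption: nearest-neighbor mapping, pixel x -> cell x * n_cells // size.
    """
    counts = Counter(x * n_cells // size for x in range(size))
    return [counts[i] for i in range(n_cells)]


if __name__ == "__main__":
    # 64 cells over 512 pixels divide evenly: every cell is 8 pixels wide.
    print(Counter(cell_widths(64, 512)))
    # 70 cells over 512 pixels do not: cells are a mix of 7 and 8 pixels wide.
    print(Counter(cell_widths(70, 512)))
```

With 70 cells across 512 pixels, 48 cells come out 7 pixels wide and 22 come out 8 pixels wide, so the faces are only approximately square.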

Q: Will there be a better trained model of this in the future?

A: I hope so. I will need to curate a much larger and higher-quality dataset, which might take me a long time. Regardless, I plan on making the control effect more faithful to the control image. I may decide to try to generalize this beyond rectangular grids, but that is not a priority. I think including non-square rectangular faces in some of the training data was perhaps harmful to the model's performance. Likewise for grids smaller than 8x8. Perhaps it is better to train separate models for very small grids (but at that point, you might as well make the images by hand) and for non-square rectangular grids.

Q: What about color quantization?

A: Coming soon, "PaletteNet".

Sample Outputs:

sample1

sample2

sample3