license: creativeml-openrail-m
tags:
  - coreml
  - stable-diffusion
  - text-to-image

ControlNet v1.1 Models And Compatible Stable Diffusion v1.5 Type Models Converted To Apple CoreML Format

For use with a Swift app or the SwiftCLI

The SD models are all "Original" (not "Split-Einsum") and built for CPU and GPU. They are each for the output size noted. They are fp16, with the standard SD-1.5 VAE embedded.

The Stable Diffusion v1.5 model and the other SD 1.5 type models contain both the standard Unet and the ControlledUnet used for a ControlNet pipeline. The correct one will be used automatically based on whether a ControlNet is enabled or not.

They have VAEEncoder.mlmodelc bundles that allow Image2Image to operate correctly at the noted resolutions, when used with a current Swift CLI pipeline or a current GUI built with ml-stable-diffusion 0.4.0, such as Mochi Diffusion 3.2, 4.0, or later.

All of the ControlNet models are "Original" ones, built for CPU and GPU compute units (cpuAndGPU) and for SD-1.5 type models. They will not work with SD-2.1 type models. The zip files each have a set of models at 4 resolutions. The 512x512 builds appear to also work with "Split-Einsum" models when using CPU and GPU (cpuAndGPU), but in my tests they do not work with "Split-Einsum" models when using the Neural Engine (NE).

All of the models in this repo work with Swift and the current apple/ml-stable-diffusion pipeline release (0.4.0). They were not built for, and will not work with, a Python Diffusers pipeline. They need ml-stable-diffusion (https://github.com/apple/ml-stable-diffusion) for command line use, or a Swift app that supports ControlNet, such as the new (June 2023) Mochi Diffusion 4.0 version (https://github.com/godly-devotion/MochiDiffusion).

The full SD models are in the "SD" folder of this repo. They are in subfolders by model name and individually zipped for a particular resolution. They need to be unzipped for use after downloading.

The ControlNet model files are in the "CN" folder of this repo. They are zipped and need to be unzipped after downloading. Each zip holds a set of 4 resolutions for that ControlNet type, built for 512x512, 512x768, 768x512 and 768x768.

There is also a "MISC" folder that has text files with some notes and a screencap of my directory structure. These are provided for those who want to convert models themselves and/or run the models with a SwiftCLI. The notes are not perfect, and may be out of date if any of the Python or CoreML packages referenced have been updated recently. You can open a Discussion here if you need help with any of the "MISC" items.

For command line use, the "MISC" notes cover setting up a miniconda3 environment. If you are using the command line, please read the notes concerning naming and placement of your ControlNet model folder.
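As a quick sanity check before a command line run, the placement of the ControlNet models can be verified with a few lines of Python. This is only a sketch: it assumes the ControlNet bundles sit in a `controlnet` subfolder of the resource folder, each named `<name>.mlmodelc`. That matches my understanding of the apple/ml-stable-diffusion layout, but the "MISC" notes are the authority here. The function name and its parameters are made up for illustration.

```python
from pathlib import Path


def missing_controlnets(resource_dir, names, subfolder="controlnet"):
    """Return the requested ControlNet names that have no compiled bundle on disk.

    ASSUMPTION: bundles live at <resource_dir>/<subfolder>/<name>.mlmodelc.
    Change `subfolder` if your pipeline expects a different folder name.
    """
    cn_dir = Path(resource_dir) / subfolder
    # A compiled CoreML model (.mlmodelc) is a bundle, i.e. a folder on disk.
    return [name for name in names if not (cn_dir / f"{name}.mlmodelc").is_dir()]
```

An empty list means every requested ControlNet was found where the sketch expects it. Any names that come back are the ones to recheck against the notes.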

If you are using a GUI like Mochi Diffusion 4.0, the app will most likely guide you to the correct location/arrangement for your ControlNet model folder. Please note that when you unzip the ControlNet files from this repo, they will unzip into a folder with the actual model files inside. This folder is just for the zipping process. What you want to move into your ControlNet model folder in Mochi Diffusion is the individual files, not the folder they unzip into. This is different from base models, where you do want to copy the folder itself. See the images here and here for an example of how my folders are set up.
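If you would rather script the unzip-and-move step than do it by hand, something like the following Python sketch could work. It assumes each ControlNet zip unpacks into a single wrapper folder holding the model bundles, as described above. The function name and the paths are hypothetical, so point it at your own zip and your own ControlNet model folder.

```python
import shutil
import tempfile
import zipfile
from pathlib import Path


def install_controlnet_zip(zip_path, controlnet_dir):
    """Unzip a ControlNet archive and move the items inside its wrapper
    folder (not the wrapper itself) into controlnet_dir.

    Returns the names of the items that were moved.
    """
    controlnet_dir = Path(controlnet_dir)
    controlnet_dir.mkdir(parents=True, exist_ok=True)
    moved = []
    with tempfile.TemporaryDirectory() as tmp:
        with zipfile.ZipFile(zip_path) as zf:
            zf.extractall(tmp)
        # Skip macOS zip residue such as __MACOSX and .DS_Store.
        top = [p for p in Path(tmp).iterdir()
               if p.name != "__MACOSX" and not p.name.startswith(".")]
        # ASSUMPTION: a single top-level folder that is not itself a model
        # bundle is the wrapper created by the zipping process.
        if len(top) == 1 and top[0].is_dir() and top[0].suffix != ".mlmodelc":
            items = [p for p in top[0].iterdir() if not p.name.startswith(".")]
        else:
            items = top
        for item in sorted(items):
            dest = controlnet_dir / item.name
            if dest.exists():
                continue  # leave models that are already installed untouched
            shutil.move(str(item), str(dest))
            moved.append(item.name)
    return moved
```

The wrapper folder is discarded along with the temporary directory, so only the model bundles land in the target folder. Base model zips should not go through this, since for those you want to keep the folder itself.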

The sizes noted for all model type inputs/outputs are WIDTH x HEIGHT. A 512x768 is "portrait" orientation and a 768x512 is "landscape" orientation.
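To make the WIDTH x HEIGHT convention concrete, here is a small Python sketch that splits a size label and reports its orientation. The function name is hypothetical, and "square" is my own label for the equal-sided case. It only handles plain labels like the four sizes used in this repo.

```python
def parse_size(label):
    """Split a 'WIDTHxHEIGHT' label into (width, height, orientation)."""
    width, height = (int(part) for part in label.lower().split("x"))
    if width == height:
        orientation = "square"
    elif width < height:
        orientation = "portrait"   # taller than wide, e.g. 512x768
    else:
        orientation = "landscape"  # wider than tall, e.g. 768x512
    return width, height, orientation
```

For example, `parse_size("512x768")` gives a width of 512, a height of 768, and "portrait".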

If you encounter any models that do not work correctly with image2image and/or a ControlNet, using the current apple/ml-stable-diffusion SwiftCLI pipeline for i2i or CN, or Mochi Diffusion 3.2 using i2i, or Mochi Diffusion 4.0 using i2i or CN, please leave a report in the Community Discussion area. If you would like to add models that you have converted, leave a message there as well, and I'll grant you access to this repo.

Base Models - A Variety Of SD-1.5-Type Models For Use With ControlNet

Each folder contains 4 zipped model files, one for each output size: 512x512, 512x768, 768x512 and 768x768

DreamShaper v5.0, 1.5-type model, "Original"
GhostMix v1.1, 1.5-type anime model, "Original"
MeinaMix v9.0, 1.5-type anime model, "Original"
MyMerge v1.0, 1.5-type NSFW model, "Original"
Realistic Vision v2.0, 1.5-type model, "Original"
Stable Diffusion v1.5, "Original"

ControlNet Models - All Current SD-1.5-Type ControlNet Models

Each zip file contains a set of 4 resolutions: 512x512, 512x768, 768x512 and 768x768

Canny -- Edge Detection, Outlines As Input
Depth -- Reproduces Depth Relationships From An Image
InPaint -- Use Masks To Define And Modify An Area (not sure how this works)
InstrP2P -- Instruct Pixel2Pixel - "Change X to Y"
LineAnime -- Find And Reuse Small Outlines, Optimized For Anime
LineArt -- Find And Reuse Small Outlines
MLSD -- Find And Reuse Straight Lines And Edges
NormalBAE -- Reproduce Depth Relationships Using Surface Normal Depth Maps
OpenPose -- Copy Body Poses
Scribble -- Freehand Sketch As Input
Segmentation -- Find And Reuse Distinct Areas
Shuffle -- Find And Reorder Major Elements
SoftEdge -- Find And Reuse Soft Edges
Tile -- Subtle Variations Within Batch Runs