---
license: creativeml-openrail-m
tags:
- coreml
- stable-diffusion
- text-to-image
---
|
# ControlNet v1.1 Models And Compatible Stable Diffusion v1.5 Type Models Converted To Apple CoreML Format
|
|
|
## For use with a Swift app or the SwiftCLI
|
|
|
The SD models are all "Original" (not "Split-Einsum") and built for CPU and GPU. Each is built for the output size noted. They are fp16, with the standard SD-1.5 VAE embedded.
|
|
|
The Stable Diffusion v1.5 model and the other SD 1.5 type models contain both the standard Unet and the ControlledUnet used for a ControlNet pipeline. The correct one is used automatically, based on whether or not a ControlNet is enabled.
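
As a rough illustration, the choice comes down to whether any ControlNet model names are passed when the pipeline is created. The sketch below is an assumption-laden example only, based on the 0.4.0 apple/ml-stable-diffusion Swift API; the folder path, the "Canny" model name, the prompt, and the helper function are placeholders:

```swift
import CoreGraphics
import CoreML
import Foundation
import StableDiffusion   // apple/ml-stable-diffusion, release 0.4.0

// Text-to-image with one ControlNet enabled. Passing an empty controlNet array
// makes the pipeline use the standard Unet instead of the ControlledUnet.
func generateWithControlNet(conditioningImage: CGImage) throws -> [CGImage?] {
    let modelConfig = MLModelConfiguration()
    modelConfig.computeUnits = .cpuAndGPU

    let pipeline = try StableDiffusionPipeline(
        resourcesAt: URL(fileURLWithPath: "Models/StableDiffusion-v1.5_512x512"),  // an unzipped SD model folder (placeholder)
        controlNet: ["Canny"],                                                      // ControlNet model name(s); [] disables ControlNet
        configuration: modelConfig,
        reduceMemory: false
    )
    try pipeline.loadResources()

    var config = StableDiffusionPipeline.Configuration(prompt: "a line drawing of a lighthouse")
    config.controlNetInputs = [conditioningImage]   // one conditioning image per ControlNet, sized to the output
    config.stepCount = 25
    config.seed = 93
    return try pipeline.generateImages(configuration: config)
}
```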
|
|
|
They include VAEEncoder.mlmodelc bundles that allow Image2Image to operate correctly at the noted resolutions when used with a current Swift CLI pipeline or a current GUI built with ml-stable-diffusion 0.4.0, such as Mochi Diffusion 3.2, 4.0, or later.
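
For Image2Image specifically, the starting image is what the VAEEncoder encodes, so it has to match the resolution the model was built for. A minimal sketch, assuming an already-loaded pipeline and the 0.4.0 `StableDiffusionPipeline.Configuration` property names (`startingImage`, `strength`); the prompt and values are placeholders:

```swift
import CoreGraphics
import StableDiffusion

// Image2Image sketch: the starting image is encoded by the model's
// VAEEncoder.mlmodelc and must match the build's resolution (e.g. 512x768).
func imageToImage(pipeline: StableDiffusionPipeline, startingImage: CGImage) throws -> [CGImage?] {
    var config = StableDiffusionPipeline.Configuration(prompt: "a watercolor landscape at dusk")
    config.startingImage = startingImage   // same width x height as the model build
    config.strength = 0.7                  // lower values stay closer to the input image
    config.stepCount = 25
    config.seed = 42
    return try pipeline.generateImages(configuration: config)
}
```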
|
|
|
All of the ControlNet models are "Original" ones, built for CPU and GPU compute units (cpuAndGPU) and for SD-1.5 type models. They will not work with SD-2.1 type models. Each zip file holds a set of models at 4 resolutions. The 512x512 builds also appear to work with "Split-Einsum" models when using CPU and GPU (cpuAndGPU), but in my tests they will not work with "Split-Einsum" models when using the Neural Engine (NE).
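
If you want to check for yourself whether a particular build loads under a given set of compute units, you can try loading the compiled .mlmodelc directly with Core ML. A small sketch (the path is a placeholder; a failed load under `.all` would be consistent with the Neural Engine limitation noted above):

```swift
import CoreML
import Foundation

// Try loading a compiled ControlNet bundle under specific compute units.
let controlNetURL = URL(fileURLWithPath: "controlnet/Canny_512x512.mlmodelc")  // placeholder path
let config = MLModelConfiguration()
config.computeUnits = .cpuAndGPU   // these builds target CPU + GPU; .all may try the Neural Engine

do {
    let model = try MLModel(contentsOf: controlNetURL, configuration: config)
    print("Loaded, inputs:", model.modelDescription.inputDescriptionsByName.keys)
} catch {
    print("Failed to load:", error)
}
```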
|
|
|
All of the models in this repo work with Swift and the current apple/ml-stable-diffusion pipeline release (0.4.0). They were not built for, and will not work with, a Python Diffusers pipeline. They need ml-stable-diffusion (https://github.com/apple/ml-stable-diffusion) for command line use, or a Swift app that supports ControlNet, such as the new (June 2023) Mochi Diffusion 4.0 version (https://github.com/godly-devotion/MochiDiffusion).
|
|
|
The full SD models are in the "SD" folder of this repo. They are in subfolders by model name and individually zipped for a particular resolution. They need to be unzipped for use after downloading.
|
|
|
The ControlNet model files are in the "CN" folder of this repo. They are zipped and need to be unzipped after downloading. Each zip holds a set of 4 resolutions for that ControlNet type, built for 512x512, 512x768, 768x512 and 768x768.
|
|
|
There is also a "MISC" folder that has text files with some notes and a screencap of my directory structure. These are provided for those who want to convert models themselves and/or run the models with a SwiftCLI. The notes are not perfect, and may be out of date if any of the Python or CoreML packages referenced have been updated recently. You can open a Discussion here if you need help with any of the "MISC" items. |
|
|
|
For command line use, the "MISC" notes cover setting up a miniconda3 environment. If you are using the command line, please read the notes concerning naming and placement of your ControlNet model folder.
|
|
|
If you are using a GUI like Mochi Diffusion 4.0, the app will most likely guide you to the correct location/arrangement for your ControlNet model folder.
|
|
|
The sizes noted for all model type inputs/outputs are WIDTH x HEIGHT. A 512x768 is "portrait" orientation and a 768x512 is "landscape" orientation.
|
|
|
**If you encounter any models that do not work correctly with image2image and/or a ControlNet (using the current apple/ml-stable-diffusion SwiftCLI pipeline for i2i or CN, Mochi Diffusion 3.2 for i2i, or Mochi Diffusion 4.0 for i2i or CN), please leave a report in the Community Discussion area. If you would like to add models that you have converted, leave a message there as well, and I'll grant you access to this repo.**
|
|
|
## Base Models - A Variety Of SD-1.5-Type Models For Use With ControlNet

Each folder contains 4 zipped model files, one per output size: 512x512, 512x768, 768x512, and 768x768
|
- DreamShaper v5.0, 1.5-type model, "Original"

- GhostMix v1.1, 1.5-type anime model, "Original"

- MeinaMix v9.0, 1.5-type anime model, "Original"

- MyMerge v1.0, 1.5-type NSFW model, "Original"

- Realistic Vision v2.0, 1.5-type model, "Original"

- Stable Diffusion v1.5, "Original"
|
|
|
## ControlNet Models - All Current SD-1.5-Type ControlNet Models

Each zip file contains a set of 4 resolutions: 512x512, 512x768, 768x512, and 768x768
|
- Canny -- Edge Detection, Outlines As Input

- Depth -- Reproduces Depth Relationships From An Image

- InPaint -- Use Masks To Define And Modify An Area (not sure how this works)

- InstrP2P -- InstructPix2Pix - "Change X to Y"

- LineAnime -- Find And Reuse Fine Outlines, Optimized For Anime

- LineArt -- Find And Reuse Fine Outlines

- MLSD -- Find And Reuse Straight Lines And Edges

- NormalBAE -- Reproduce Depth Relationships Using Surface Normal Maps

- OpenPose -- Copy Body Poses

- Scribble -- Freehand Sketch As Input

- Segmentation -- Find And Reuse Distinct Areas

- Shuffle -- Find And Reorder Major Elements

- SoftEdge -- Find And Reuse Soft Edges

- Tile -- Subtle Variations Within Batch Runs
|