metadata

license: other
base_model: black-forest-labs/FLUX.1-dev
tags:
  - flux
  - flux-diffusers
  - text-to-image
  - diffusers
  - simpletuner
  - safe-for-work
  - lora
  - template:sd-lora
  - standard
inference: true
widget:
  - text: unconditional (blank prompt)
    parameters:
      negative_prompt: blurry, cropped, ugly
    output:
      url: ./assets/image_0_0.png
  - text: >-
      A scene from Chainsaw Man. Makima holding a sign that says 'I LOVE
      PROMPTS!', she is standing full body on a beach at sunset. She is wearing
      her white button-up shirt, black tie, and black trousers. The setting sun
      casts a dynamic shadow on her composed and enigmatic expression.
    parameters:
      negative_prompt: blurry, cropped, ugly
    output:
      url: ./assets/image_1_0.png
  - text: >-
      A scene from Chainsaw Man. Makima jumping out of a propeller airplane, sky
      diving. Her expression remains calm and controlled, her red hair flowing
      in the wind. The sky is clear and blue, with birds flying in the distance.
    parameters:
      negative_prompt: blurry, cropped, ugly
    output:
      url: ./assets/image_2_0.png
  - text: >-
      A scene from Chainsaw Man. Makima spinning a basketball on her finger on a
      basketball court. She is wearing a Lakers jersey with the #12 on it. The
      basketball hoop and cheering crowd are in the background. She has a
      composed and confident smile.
    parameters:
      negative_prompt: blurry, cropped, ugly
    output:
      url: ./assets/image_3_0.png
  - text: >-
      A scene from Chainsaw Man. Makima is wearing a professional suit in an
      office, shaking the hand of a businesswoman. The woman has purple hair and
      is wearing formal attire. There is a Google logo in the background. It is
      during daytime, and the overall sentiment is one of achievement and
      authority.
    parameters:
      negative_prompt: blurry, cropped, ugly
    output:
      url: ./assets/image_4_0.png
  - text: >-
      A scene from Chainsaw Man. Makima is fighting a large brown grizzly bear,
      deep in a forest. The bear is tall and standing on two legs, roaring. The
      bear is also wearing a crown because it is the king of all bears. Around
      them are tall trees and other animals watching intently.
    parameters:
      negative_prompt: blurry, cropped, ugly
    output:
      url: ./assets/image_5_0.png

makima-standard-lora-1

This is a standard PEFT LoRA derived from black-forest-labs/FLUX.1-dev.

No validation prompt was used during training.

Validation settings

CFG: 3.5
CFG Rescale: 0.0
Steps: 20
Sampler: FlowMatchEulerDiscreteScheduler
Seed: 42
Resolution: 1024x1024
Skip-layer guidance:

Note: The validation settings are not necessarily the same as the training settings.

You can find some example images in the following gallery:

All six examples share the negative prompt "blurry, cropped, ugly". The prompts are:

- unconditional (blank prompt)
- A scene from Chainsaw Man. Makima holding a sign that says 'I LOVE PROMPTS!', she is standing full body on a beach at sunset. She is wearing her white button-up shirt, black tie, and black trousers. The setting sun casts a dynamic shadow on her composed and enigmatic expression.
- A scene from Chainsaw Man. Makima jumping out of a propeller airplane, sky diving. Her expression remains calm and controlled, her red hair flowing in the wind. The sky is clear and blue, with birds flying in the distance.
- A scene from Chainsaw Man. Makima spinning a basketball on her finger on a basketball court. She is wearing a Lakers jersey with the #12 on it. The basketball hoop and cheering crowd are in the background. She has a composed and confident smile.
- A scene from Chainsaw Man. Makima is wearing a professional suit in an office, shaking the hand of a businesswoman. The woman has purple hair and is wearing formal attire. There is a Google logo in the background. It is during daytime, and the overall sentiment is one of achievement and authority.
- A scene from Chainsaw Man. Makima is fighting a large brown grizzly bear, deep in a forest. The bear is tall and standing on two legs, roaring. The bear is also wearing a crown because it is the king of all bears. Around them are tall trees and other animals watching intently.

The text encoder was not trained. You may reuse the base model text encoder for inference.

Training settings

Training epochs: 249
Training steps: 3000
Learning rate: 0.0003
- Learning rate schedule: constant
- Warmup steps: 100
Max grad norm: 2.0
Effective batch size: 56
- Micro-batch size: 56
- Gradient accumulation steps: 1
- Number of GPUs: 1
Gradient checkpointing: True
Prediction type: flow-matching (extra parameters=['shift=3', 'flux_guidance_mode=constant', 'flux_guidance_value=1.0', 'flow_matching_loss=compatible', 'flux_lora_target=all'])
Optimizer: adamw_bf16
Trainable parameter precision: Pure BF16
Caption dropout probability: 0.0%
LoRA Rank: 128
LoRA Alpha: None
LoRA Dropout: 0.1
LoRA initialisation style: default
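
The batch-size figures above are related by simple arithmetic. The following is a minimal sketch, assuming the usual convention that the effective batch size is the product of the micro-batch size, the gradient accumulation steps, and the GPU count. The function name `effective_batch_size` is hypothetical and not part of any trainer's API.

```python
def effective_batch_size(micro_batch: int, grad_accum_steps: int, num_gpus: int) -> int:
    """Samples contributing to one optimizer step (assumed convention)."""
    return micro_batch * grad_accum_steps * num_gpus


# Values taken from the training settings listed above.
batch = effective_batch_size(micro_batch=56, grad_accum_steps=1, num_gpus=1)

# Total samples processed over the whole run: training steps x effective batch size.
samples_seen = batch * 3000

print(batch)         # 56
print(samples_seen)  # 168000
```

With a single GPU and no gradient accumulation, the effective batch size equals the micro-batch size, which matches the two values reported above.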

Datasets

makima-512

Repeats: 2
Total number of images: 172
Total number of aspect buckets: 1
Resolution: 0.262144 megapixels
Cropped: False
Crop style: None
Crop aspect: None
Used for regularisation data: No
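
The dataset resolution is reported as a pixel area rather than as a width and height. As a rough sketch, assuming a square image (the card reports a single aspect bucket but does not state its shape), the area can be converted to an edge length as follows. The helper name `square_edge_from_megapixels` is hypothetical.

```python
import math


def square_edge_from_megapixels(megapixels: float) -> int:
    """Edge length in pixels of a square image with the given area (assumes a square)."""
    pixels = round(megapixels * 1_000_000)  # megapixels -> whole pixels
    return math.isqrt(pixels)               # integer square root


edge = square_edge_from_megapixels(0.262144)
print(edge)  # 512
```

Under that assumption, 0.262144 megapixels corresponds to 512x512 pixels, which is consistent with the dataset name makima-512.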

Inference

import torch
from diffusers import DiffusionPipeline

model_id = 'black-forest-labs/FLUX.1-dev'
adapter_id = 'adipanda/makima-standard-lora-1'

# Pick the best available device once and reuse it for the pipeline and the generator.
device = 'cuda' if torch.cuda.is_available() else 'mps' if torch.backends.mps.is_available() else 'cpu'

# Load the base model directly in bf16, then attach the LoRA adapter.
pipeline = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.bfloat16)
pipeline.load_lora_weights(adapter_id)

prompt = "An astronaut is riding a horse through the jungles of Thailand."

# Optional: quantise the transformer to save VRAM (requires optimum-quanto).
# Note: the model was quantised during training, so doing the same at inference time is recommended.
from optimum.quanto import quantize, freeze, qint8
quantize(pipeline.transformer, weights=qint8)
freeze(pipeline.transformer)

pipeline.to(device)  # the pipeline is already in its target precision level
image = pipeline(
    prompt=prompt,
    num_inference_steps=20,
    generator=torch.Generator(device=device).manual_seed(42),  # fixed seed for reproducibility
    width=1024,
    height=1024,
    guidance_scale=3.5,
).images[0]
image.save("output.png", format="PNG")