Model card auto-generated by SimpleTuner
metadata
license: other
base_model: black-forest-labs/FLUX.1-dev
tags:
  - flux
  - flux-diffusers
  - text-to-image
  - diffusers
  - simpletuner
  - safe-for-work
  - lora
  - template:sd-lora
  - standard
inference: true
widget:
  - text: unconditional (blank prompt)
    parameters:
      negative_prompt: blurry, cropped, ugly
    output:
      url: ./assets/image_0_0.png
  - text: >-
      A scene from Chainsaw Man. Makima holding a sign that says 'I LOVE
      PROMPTS!', she is standing full body on a beach at sunset. She is wearing
      her white button-up shirt, black tie, and black trousers. The setting sun
      casts a dynamic shadow on her composed and enigmatic expression.
    parameters:
      negative_prompt: blurry, cropped, ugly
    output:
      url: ./assets/image_1_0.png
  - text: >-
      A scene from Chainsaw Man. Makima jumping out of a propeller airplane, sky
      diving. Her expression remains calm and controlled, her red hair flowing
      in the wind. The sky is clear and blue, with birds flying in the distance.
    parameters:
      negative_prompt: blurry, cropped, ugly
    output:
      url: ./assets/image_2_0.png
  - text: >-
      A scene from Chainsaw Man. Makima spinning a basketball on her finger on a
      basketball court. She is wearing a Lakers jersey with the #12 on it. The
      basketball hoop and cheering crowd are in the background. She has a
      composed and confident smile.
    parameters:
      negative_prompt: blurry, cropped, ugly
    output:
      url: ./assets/image_3_0.png
  - text: >-
      A scene from Chainsaw Man. Makima is wearing a professional suit in an
      office, shaking the hand of a businesswoman. The woman has purple hair and
      is wearing formal attire. There is a Google logo in the background. It is
      during daytime, and the overall sentiment is one of achievement and
      authority.
    parameters:
      negative_prompt: blurry, cropped, ugly
    output:
      url: ./assets/image_4_0.png
  - text: >-
      A scene from Chainsaw Man. Makima is fighting a large brown grizzly bear,
      deep in a forest. The bear is tall and standing on two legs, roaring. The
      bear is also wearing a crown because it is the king of all bears. Around
      them are tall trees and other animals watching intently.
    parameters:
      negative_prompt: blurry, cropped, ugly
    output:
      url: ./assets/image_5_0.png

makima-standard-lora-1

This is a standard PEFT LoRA derived from black-forest-labs/FLUX.1-dev.
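As a rough illustration of what a LoRA adds on top of the frozen base model, the sketch below counts the trainable parameters an adapter contributes to a single linear layer. The 3072 x 3072 layer shape is an assumption chosen for illustration and is not taken from this card, and `lora_param_count` is a hypothetical helper; only the rank of 128 comes from the training settings.

```python
def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters added by one LoRA adapter on a d_out x d_in linear layer.

    LoRA learns two small matrices, A (rank x d_in) and B (d_out x rank);
    their product B @ A is added to the frozen base weight.
    """
    return rank * d_in + d_out * rank


# Assumed layer shape, for illustration only.
d_in, d_out = 3072, 3072
rank = 128  # LoRA rank reported in the training settings

full = d_in * d_out
lora = lora_param_count(d_in, d_out, rank)
print(f"full layer:   {full:,} weights")
print(f"LoRA adapter: {lora:,} trainable weights ({lora / full:.1%} of the layer)")
```

The adapter size grows linearly with the rank, which is why a rank-128 LoRA is noticeably larger than the low-rank adapters often shared for style transfer.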

No validation prompt was used during training.

Validation settings

  • CFG: 3.5
  • CFG Rescale: 0.0
  • Steps: 20
  • Sampler: FlowMatchEulerDiscreteScheduler
  • Seed: 42
  • Resolution: 1024x1024
  • Skip-layer guidance:

Note: The validation settings are not necessarily the same as the training settings.

You can find some example images in the following gallery:

  • Prompt: unconditional (blank prompt)
    • Negative prompt: blurry, cropped, ugly
  • Prompt: A scene from Chainsaw Man. Makima holding a sign that says 'I LOVE PROMPTS!', she is standing full body on a beach at sunset. She is wearing her white button-up shirt, black tie, and black trousers. The setting sun casts a dynamic shadow on her composed and enigmatic expression.
    • Negative prompt: blurry, cropped, ugly
  • Prompt: A scene from Chainsaw Man. Makima jumping out of a propeller airplane, sky diving. Her expression remains calm and controlled, her red hair flowing in the wind. The sky is clear and blue, with birds flying in the distance.
    • Negative prompt: blurry, cropped, ugly
  • Prompt: A scene from Chainsaw Man. Makima spinning a basketball on her finger on a basketball court. She is wearing a Lakers jersey with the #12 on it. The basketball hoop and cheering crowd are in the background. She has a composed and confident smile.
    • Negative prompt: blurry, cropped, ugly
  • Prompt: A scene from Chainsaw Man. Makima is wearing a professional suit in an office, shaking the hand of a businesswoman. The woman has purple hair and is wearing formal attire. There is a Google logo in the background. It is during daytime, and the overall sentiment is one of achievement and authority.
    • Negative prompt: blurry, cropped, ugly
  • Prompt: A scene from Chainsaw Man. Makima is fighting a large brown grizzly bear, deep in a forest. The bear is tall and standing on two legs, roaring. The bear is also wearing a crown because it is the king of all bears. Around them are tall trees and other animals watching intently.
    • Negative prompt: blurry, cropped, ugly

The text encoder was not trained. You may reuse the base model text encoder for inference.

Training settings

  • Training epochs: 249

  • Training steps: 3000

  • Learning rate: 0.0003

    • Learning rate schedule: constant
    • Warmup steps: 100
  • Max grad norm: 2.0

  • Effective batch size: 56

    • Micro-batch size: 56
    • Gradient accumulation steps: 1
    • Number of GPUs: 1
  • Gradient checkpointing: True

  • Prediction type: flow-matching (extra parameters=['shift=3', 'flux_guidance_mode=constant', 'flux_guidance_value=1.0', 'flow_matching_loss=compatible', 'flux_lora_target=all'])

  • Optimizer: adamw_bf16

  • Trainable parameter precision: Pure BF16

  • Caption dropout probability: 0.0%

  • LoRA Rank: 128

  • LoRA Alpha: None

  • LoRA Dropout: 0.1

  • LoRA initialisation style: default
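The batch and step figures above can be cross-checked with a little arithmetic. This is a minimal sketch that assumes the usual definition of effective batch size (micro-batch size x gradient accumulation steps x number of GPUs); the variable names are illustrative and are not SimpleTuner option names.

```python
# Values copied from the training settings listed above.
micro_batch_size = 56
gradient_accumulation_steps = 1
num_gpus = 1
training_steps = 3000
training_epochs = 249

# Assumed definition: samples consumed per optimizer step.
effective_batch_size = micro_batch_size * gradient_accumulation_steps * num_gpus

# Total samples drawn over the whole run, and average optimizer steps per epoch.
samples_seen = training_steps * effective_batch_size
steps_per_epoch = training_steps / training_epochs

print(f"effective batch size: {effective_batch_size}")
print(f"samples seen:         {samples_seen:,}")
print(f"steps per epoch:      {steps_per_epoch:.2f}")
```

With a single GPU and no gradient accumulation, the effective batch size equals the micro-batch size, which matches the reported value of 56.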

Datasets

makima-512

  • Repeats: 2
  • Total number of images: 172
  • Total number of aspect buckets: 1
  • Resolution: 0.262144 megapixels
  • Cropped: False
  • Crop style: None
  • Crop aspect: None
  • Used for regularisation data: No
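The resolution above is given as a pixel area rather than a width and height. The sketch below converts it to a pixel count and shows the side length a square image of that area would have. Treating the image as square is an assumption made for illustration; the card only states that there is a single aspect bucket.

```python
import math

megapixels = 0.262144  # dataset resolution reported above

# Convert megapixels to a whole number of pixels.
pixels = round(megapixels * 1_000_000)

# Side length if the area were a perfect square (assumption, see the note above).
side = math.isqrt(pixels)

print(f"{pixels:,} pixels, i.e. {side} x {side} if square")
```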

Inference

import torch
from diffusers import DiffusionPipeline

model_id = 'black-forest-labs/FLUX.1-dev'
adapter_id = 'adipanda/makima-standard-lora-1'

# Load the base model directly in bf16, then attach the LoRA weights.
pipeline = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.bfloat16)
pipeline.load_lora_weights(adapter_id)

prompt = "An astronaut is riding a horse through the jungles of Thailand."

# Optional: quantise the transformer to int8 to save on VRAM.
# Note: the model was quantised during training, so it is recommended to do the same at inference time.
from optimum.quanto import quantize, freeze, qint8
quantize(pipeline.transformer, weights=qint8)
freeze(pipeline.transformer)

# Pick the best available device once and reuse it for the pipeline and the generator.
device = 'cuda' if torch.cuda.is_available() else 'mps' if torch.backends.mps.is_available() else 'cpu'
pipeline.to(device)  # the pipeline is already in its target precision level

image = pipeline(
    prompt=prompt,
    num_inference_steps=20,
    generator=torch.Generator(device=device).manual_seed(42),  # fixed seed for reproducibility
    width=1024,
    height=1024,
    guidance_scale=3.5,
).images[0]
image.save("output.png", format="PNG")