license: other
license_name: bria-t2i
license_link: https://bria.ai/customer-general-terms-and-conditions
library_name: diffusers
inference: false
tags:
  - text-to-image
  - legal liability
  - commercial use
extra_gated_description: >-
  Model weights from BRIA AI can be obtained with the purchase of a commercial
  license. Fill in the form below and we will reach out to you. Need API Access? Get
  it [here](https://platform.bria.ai/console/api/image-generation) (1K Monthly
  Free API Calls). Startup or a student? Get access by applying for our [Startup
  Program](https://pages.bria.ai/the-visual-generative-ai-platform-for-builders-startups-plan?_gl=1*cqrl81*_ga*MTIxMDI2NzI5OC4xNjk5NTQ3MDAz*_ga_WRN60H46X4*MTcwOTM5OTMzNC4yNzguMC4xNzA5Mzk5MzM0LjYwLjAuMA..)
extra_gated_heading: Fill in this form to request a commercial license for the model
extra_gated_fields:
  Name: text
  Company/Org name: text
  Org Type (Early/Growth Startup, Enterprise, Academy): text
  Role: text
  Country: text
  Email: text
  By submitting this form, I agree to BRIA’s Privacy policy and Terms & conditions, see links below: checkbox

BRIA-4B-Adapt: Fine-tune oriented Text-to-Image Model for Commercial Licensing

BRIA-4B-Adapt is our new groundbreaking 4-billion-parameter text-to-image model, explicitly designed to provide exceptional fine-tuning capabilities for commercial use. The model excels at adopting the tuned style while preserving remarkably high prompt alignment. This model combines technological innovation with ethical responsibility and legal security, setting a new standard in the AI industry. Bria AI licenses the foundation model with full legal liability coverage. Our dataset does not contain copyrighted materials, such as fictional characters, logos, trademarks, public figures, harmful content, or privacy-infringing content.

For more information, please visit our website.

Join our Discord community for more information, tutorials, tools, and to connect with other users!

Get Access

Interested in BRIA-4B-Adapt? Purchase is required to license and access BRIA-4B-Adapt, ensuring royalty management with our data partners and full liability coverage for commercial use.

Are you a startup or a student? We encourage you to apply for our Startup Program to request access. This program is designed to support emerging businesses and academic pursuits with our cutting-edge technology.

Contact us today to unlock the potential of BRIA-4B-Adapt! By submitting the form above, you agree to BRIA’s Privacy policy and Terms & conditions.

Key Features

  • Legally Compliant: Offers full legal liability coverage for copyright and privacy infringements. Thanks to training on 100% licensed data from leading data partners, we ensure the ethical use of content.

  • Patented Attribution Engine: Our attribution engine is our way to compensate our data partners, powered by our proprietary and patented algorithms.

  • Enterprise-Ready: Specifically designed for business applications, BRIA-4B-Adapt delivers high-quality fine-tuning capabilities for generating compliant imagery for a variety of commercial needs.

  • Customizable Technology: Provides access to source code and weights for extensive customization, catering to specific business requirements.

  • Fully Automated: Provides fully automated, no-code fine-tuning on Bria's platform: https://platform.bria.ai/console/tailored-generation.

Model Description

  • Developed by: BRIA AI

  • Model type: Latent Flow-Matching Text-to-Image Model

  • License: Commercial licensing terms & conditions.

  • Purchase is required to license and access the model.

  • Model Description: BRIA-4B-Adapt is a text-to-image model trained exclusively on a professional-grade, licensed dataset. It is designed for commercial use and includes full legal liability coverage.

  • Resources for more information: BRIA AI

Usage

Installations

pip install -qr https://huggingface.co/briaai/BRIA-4B-Adapt/resolve/main/requirements.txt

from huggingface_hub import hf_hub_download
import os

# Download next to this script; fall back to the current directory when
# __file__ is undefined (e.g. in a notebook or interactive session).
try:
    local_dir = os.path.dirname(__file__)
except NameError:
    local_dir = '.'

# Fetch the pipeline, transformer, utility, and LoRA training modules.
hf_hub_download(repo_id="briaai/BRIA-4B-Adapt", filename='pipeline_bria.py', local_dir=local_dir)
hf_hub_download(repo_id="briaai/BRIA-4B-Adapt", filename='transformer_bria.py', local_dir=local_dir)
hf_hub_download(repo_id="briaai/BRIA-4B-Adapt", filename='bria_utils.py', local_dir=local_dir)
hf_hub_download(repo_id="briaai/BRIA-4B-Adapt", filename='train_lora.py', local_dir=local_dir)

Training a new LoRA

To fine-tune a new LoRA on top of BRIA-4B-Adapt, use the provided training script (based on: https://github.com/huggingface/diffusers/blob/main/examples/dreambooth/train_dreambooth_lora_flux.py). LoRA fine-tuning is useful for cases such as teaching the model to generate a specific cartoon character or a particular style for scenes, objects, characters, icons, etc.

In the following example we train a LoRA for a cartoon bear character, using the data provided under "example_finetune_data", which contains some images of this character and a CSV file with captions:

[Image: training_data]

python train_lora.py \
    --pretrained_model_name_or_path briaai/BRIA-4B-Adapt \
    --dataset_name example_finetune_data/ \
    --output_dir example_output_lora/ \
    --max_train_steps 1500 \
    --rank 128 \
    --train_batch_size 1 \
    --gradient_accumulation_steps 4
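
As a rough guide to what these flags imply, the sketch below computes the effective batch size and the number of training samples the run processes. It assumes that "max_train_steps" counts optimizer updates (as in the diffusers script this one is based on) and that training runs on a single process; `training_exposure` and `num_images` are hypothetical names used only for illustration.

```python
def training_exposure(train_batch_size, gradient_accumulation_steps,
                      max_train_steps, num_processes=1):
    """Return (effective_batch_size, total_samples_seen) for a training run."""
    effective_batch_size = (
        train_batch_size * gradient_accumulation_steps * num_processes
    )
    total_samples_seen = effective_batch_size * max_train_steps
    return effective_batch_size, total_samples_seen


# Values taken from the command above; one process (one GPU) is assumed.
effective, seen = training_exposure(
    train_batch_size=1, gradient_accumulation_steps=4, max_train_steps=1500
)
print(f"effective batch size: {effective}, samples seen: {seen}")

# Hypothetical dataset size, for illustration only.
num_images = 10
print(f"approximate passes over the data: {seen / num_images:.0f}")
```

With the example flags this gives an effective batch size of 4 and 6000 samples processed in total, so a small dataset is revisited many times. That is one reason to compare intermediate checkpoints rather than only the final one.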

Some tips for training:

  • Image variety: use images that are consistent in the visual element you want the model to learn, but varied enough for the model to generalize across this domain.
  • Image resolution: images should be at least 1024x1024 or similar. By default, images are resized and center-cropped to 1024x1024. Control this with the arguments "center_crop" and "resolution".
  • Captions: each caption should describe the image's unique content, but it is advisable to include a constant description of the visual domain (e.g. "An illustration of a cute brown bear") or a unique "trigger-word" ("a character named Briabear"). Captions should be shorter than 128 tokens (~100 words).
  • Training hyperparameters to consider:
    • "rank": the default is 128, but lower ranks can suffice for simple use cases. Try increasing the rank to teach the model finer details.
    • "max_train_steps": how many steps the model trains for. Consider training for more steps and evaluating different checkpoints to choose the best one.
    • "optimizer": we use "prodigy" by default with a "learning_rate" of 1, as suggested in: https://github.com/huggingface/diffusers/blob/main/examples/dreambooth/README_flux.md
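
The caption tips above can be checked before training with a small script. This is a minimal sketch, not part of the BRIA tooling: `caption_warnings` is a hypothetical helper, and it uses word count as a rough proxy for the 128-token limit (following the "~100 words" estimate above) rather than running the model's actual tokenizer.

```python
def caption_warnings(caption, trigger=None, max_words=100):
    """Return a list of warnings for one training caption.

    Word count is only a proxy for the token limit; the real tokenizer
    may split words into several tokens.
    """
    warnings = []
    n_words = len(caption.split())
    if n_words > max_words:
        warnings.append(f"{n_words} words; may exceed the 128-token limit")
    # The trigger check is optional: pass trigger=None to skip it.
    if trigger is not None and trigger.lower() not in caption.lower():
        warnings.append(f"trigger word {trigger!r} not found")
    return warnings


captions = [
    "An illustration of a character named Briabear, a cute brown bear, waving.",
    "A cute brown bear riding a bicycle.",
]
for text in captions:
    print(text, "->", caption_warnings(text, trigger="Briabear"))
```

The second caption is flagged because it lacks the trigger word. Whether to enforce a trigger word or a constant domain description is a per-dataset choice.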

Generating an image using a trained LoRA

Once your LoRA is trained, you can load it into the Bria pipeline to generate new images in the domain it was trained on. In the example below we load the LoRA trained on the Briabear character and generate a new image of this character.

import torch
from pipeline_bria import BriaPipeline

# trust_remote_code=True allows loading a transformer that is not part of the transformers library (from transformer/bria_transformer.py)
pipe = BriaPipeline.from_pretrained("briaai/BRIA-4B-Adapt", torch_dtype=torch.bfloat16, trust_remote_code=True)
# Load the example LoRA weights on top of the base model
pipe.load_lora_weights("briaai/BRIA-4B-Adapt", subfolder="example_finetuned_model", weight_name="pytorch_lora_weights.safetensors")
pipe.to(device="cuda")

prompt = "An illustration of a character named Briabear, a cute brown bear, wearing a purple bowtie and a purple top hat, colorful striped background."
negative_prompt = "Logo,Watermark,Text,Ugly,Morbid,Extra fingers,Poorly drawn hands,Mutation,Blurry,Extra limbs,Gross proportions,Missing arms,Mutated hands,Long neck,Duplicate,Mutilated,Mutilated hands,Poorly drawn face,Deformed,Bad anatomy,Cloned face,Malformed limbs,Missing legs,Too many fingers"

# .images[0] selects the first (and only) generated image
image = pipe(prompt=prompt, negative_prompt=negative_prompt, height=1024, width=1024).images[0]

[Image: example_generation]

Some tips for using our text-to-image model at inference:

  1. Using a negative prompt can be useful in some cases.
  2. We support multiple aspect ratios, but the total resolution should be approximately 1024*1024 = 1M pixels, for example: (1024, 1024), (1280, 768), (1344, 768), (832, 1216), (1152, 832), (1216, 832), (960, 1088).
  3. Use 30-50 steps (higher is better).
  4. Use a guidance_scale of 5.0.
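
To pick a resolution for a given aspect ratio, one option is to choose the closest pair from the list in tip 2. The sketch below does this and also reports how far each pair is from the 1024*1024 pixel budget. The helper names are hypothetical, and because the tip does not state whether each pair is (width, height) or (height, width), the code only compares the ratio of the first value to the second and leaves the mapping to "height" and "width" to you.

```python
# Resolution pairs copied from tip 2; the order within each pair
# (width/height vs. height/width) is not specified in the source.
LISTED_RESOLUTIONS = [
    (1024, 1024), (1280, 768), (1344, 768), (832, 1216),
    (1152, 832), (1216, 832), (960, 1088),
]
TARGET_PIXELS = 1024 * 1024


def relative_pixel_error(first, second):
    """Relative deviation of first*second from the ~1M-pixel target."""
    return abs(first * second - TARGET_PIXELS) / TARGET_PIXELS


def closest_listed_resolution(first, second):
    """Return the listed pair whose first/second ratio is nearest the request."""
    wanted = first / second
    return min(
        LISTED_RESOLUTIONS,
        key=lambda pair: abs(pair[0] / pair[1] - wanted),
    )


# Example: find the listed pair closest to a 16:9 ratio.
pair = closest_listed_resolution(16, 9)
print(pair, f"{relative_pixel_error(*pair):.1%} off the pixel target")
```

Every listed pair stays within about 10% of the 1024*1024 budget, which is one way to sanity-check a custom resolution before using it.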