🖼️ Multimodal
> At Hugging Face we released SmolVLM, a performant and efficient smol vision language model
> Show Lab released ShowUI-2B, a new vision-language-action model for building GUI/web automation agents 🤖
> Rhymes AI released the base models of Aria: Aria-Base-64K and Aria-Base-8K, named for their respective context lengths
> ViDoRe team released ColSmolVLM, a new ColPali-like retrieval model based on SmolVLM
> Dataset: LLaVA-CoT-o1-Instruct, a new dataset labelled using the LLaVA-CoT multimodal reasoning model
> Dataset: LLaVA-CoT-100k, the dataset used to train LLaVA-CoT, released by its creators
💬 LLMs
> Qwen team released QwQ-32B-Preview, a state-of-the-art open-source reasoning model that broke the internet 🔥
> Alibaba released Marco-o1, a new open-source reasoning model 🔥
> NVIDIA released Hymba 1.5B Base and Instruct, new state-of-the-art SLMs with a hybrid architecture (Mamba + transformer)
⏯️ Image/Video Generation
> Qwen2VL-Flux: a new image generation model based on the Qwen2VL image encoder, T5, and Flux for generation
> Lightricks released LTX-Video, a new DiT-based video generation model that can generate 24 FPS videos at 768x512 resolution ⏯️
> Dataset: Image Preferences, a new image generation preference dataset made with the DIBT community effort from Argilla 🏷️
Audio
> OuteAI released OuteTTS-0.2-500M, a new multilingual text-to-speech model based on Qwen-2.5-0.5B and trained on 5B audio prompt tokens
reacted to MonsterMMORPG's post · 3 months ago
FLUX Tools Complete Tutorial with SwarmUI (as easy as Automatic1111 or Forge): Outpainting, Inpainting, Redux Style Transfer + Re-Imagine + Combine Multiple Images, Depth and Canny - More info in the oldest comment - No paywall: https://youtu.be/hewDdVJEqOQ
FLUX.1 Tools by BlackForestLabs changed the #AI field forever; they became the number 1 open-source community provider after this massive release. In this tutorial, I will show you step by step how to use the FLUX.1 Fill model (an inpainting model) to do perfect outpainting (yes, this model is also used for outpainting) and inpainting. Moreover, I will show all features of the FLUX Redux model for style transfer / re-imagining one image or a combination of several images. Furthermore, I will show you step by step how to convert an input image into Depth or Canny maps and then how to use them with the #FLUX Depth and Canny models, covering both the LoRA and full checkpoints of FLUX Depth and Canny.
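The tutorial does the Depth/Canny preprocessing inside SwarmUI, but the idea behind a Canny-style control map can be sketched in plain Python. The snippet below is a simplified, hypothetical stand-in for a real Canny detector (no hysteresis or non-maximum suppression): it approximates edge strength with Sobel gradients and thresholds the result into the kind of black-and-white edge map a Canny control model consumes.

```python
import numpy as np

def sobel_edge_map(gray: np.ndarray, threshold: float = 0.2) -> np.ndarray:
    """Approximate a Canny-style control map from a 2D grayscale array.

    Computes Sobel gradient magnitude, normalizes it to [0, 1], and
    thresholds it into a 0/255 edge map (white edges on black).
    """
    # Sobel kernels for horizontal and vertical gradients
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
    ky = kx.T

    h, w = gray.shape
    padded = np.pad(gray.astype(float), 1, mode="edge")
    gx = np.zeros((h, w))
    gy = np.zeros((h, w))
    # Manual 3x3 cross-correlation (keeps the sketch dependency-free)
    for i in range(3):
        for j in range(3):
            patch = padded[i:i + h, j:j + w]
            gx += kx[i, j] * patch
            gy += ky[i, j] * patch

    mag = np.hypot(gx, gy)
    mag /= mag.max() + 1e-8  # normalize so the threshold is scale-free
    return (mag > threshold).astype(np.uint8) * 255
```

Feeding such a map to a Canny-conditioned model constrains generation to follow the input image's outlines while leaving colors and textures free; production pipelines use a full Canny implementation (e.g. OpenCV's), but the principle is the same.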
Preparing this tutorial took more than a week, and it will be the very best and easiest tutorial to follow since it is made with the famous #SwarmUI. SwarmUI is as easy and as advanced as the Automatic1111 SD Web UI. SwarmUI's biggest advantage is that it uses ComfyUI as a back-end; therefore, it is extremely fast, VRAM-optimized, and supports all of the newest SOTA models as soon as they are published.
So in this tutorial I will show you how to set up SwarmUI and the FLUX Dev tools on your Windows computer, Massed Compute, RunPod, and Kaggle. I will explain everything step by step and show you every tip and trick you need to properly do style transfer, re-imagine, inpainting, outpainting, depth, and canny with FLUX.