STIV: Scalable Text and Image Conditioned Video Generation Paper • 2412.07730 • Published 14 days ago • 69
Falcon3 Collection Falcon3 family of Open Foundation Models is a set of pretrained and instruct LLMs ranging from 1B to 10B parameters. • 40 items • Updated 6 days ago • 70
Arbitrary-steps Image Super-resolution via Diffusion Inversion Paper • 2412.09013 • Published 13 days ago • 11
Article Building an AI-powered search engine from scratch By as-cle-bert • 13 days ago • 8
Article Financial Analysis with Langchain and CrewAI Agents By herooooooooo • Jun 30 • 8
Flux.1 Tools Collection FLUX.1 Tools, a suite of models designed to add control and steerability to the base text-to-image model FLUX.1 • 6 items • Updated Nov 22 • 13
High Fidelity Text-Guided Music Generation and Editing via Single-Stage Flow Matching Paper • 2407.03648 • Published Jul 4 • 17
Enhance Your Images Collection Some trending Gradio apps on Spaces that you can use to enhance/upscale your images for free. This collection will be kept up to date with new releases. • 7 items • Updated Aug 22 • 17
Gradio Spaces for Background Removal Collection Enhance your images by removing the background. Will ensure these Spaces are up and maintained for the community. • 5 items • Updated Aug 20 • 23
Gemma Scope: Open Sparse Autoencoders Everywhere All At Once on Gemma 2 Paper • 2408.05147 • Published Aug 9 • 38
Article ColPali: Efficient Document Retrieval with Vision Language Models 👀 By manu • Jul 5 • 182
Article Fine-tuning Florence-2 - Microsoft's Cutting-edge Vision Language Models Jun 24 • 180
AV-DiT: Efficient Audio-Visual Diffusion Transformer for Joint Audio and Video Generation Paper • 2406.07686 • Published Jun 11 • 14