Enhance Your Images Collection Some trending Gradio apps on Spaces that you can use to enhance or upscale your images for free. This collection will be kept up to date with new releases. • 7 items • Updated Aug 22 • 17
Gradio Spaces for Background Removal Collection Enhance your images by removing the background. These Spaces will be kept up and maintained for the community. • 5 items • Updated Aug 20 • 23
Gemma Scope: Open Sparse Autoencoders Everywhere All At Once on Gemma 2 Paper • 2408.05147 • Published Aug 9 • 37
Article ColPali: Efficient Document Retrieval with Vision Language Models 👀 By manu • Jul 5 • 134
Article Fine-tuning Florence-2 - Microsoft's Cutting-edge Vision Language Models Jun 24 • 176
AV-DiT: Efficient Audio-Visual Diffusion Transformer for Joint Audio and Video Generation Paper • 2406.07686 • Published Jun 11 • 14
Face Adapter for Pre-Trained Diffusion Models with Fine-Grained ID and Attribute Control Paper • 2405.12970 • Published May 21 • 22
Paint by Inpaint: Learning to Add Image Objects by Removing Them First Paper • 2404.18212 • Published Apr 28 • 27
InstantFamily: Masked Attention for Zero-shot Multi-ID Image Generation Paper • 2404.19427 • Published Apr 30 • 71
PuLID: Pure and Lightning ID Customization via Contrastive Alignment Paper • 2404.16022 • Published Apr 24 • 19
Edit Your Image! Collection Find all the trending and useful Gradio demos that you can use to edit your images. • 21 items • Updated Apr 26 • 23
List Items One by One: A New Data Source and Learning Paradigm for Multimodal LLMs Paper • 2404.16375 • Published Apr 25 • 16
CRM: Single Image to 3D Textured Mesh with Convolutional Reconstruction Model Paper • 2403.05034 • Published Mar 8 • 20
GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection Paper • 2403.03507 • Published Mar 6 • 182
ScreenAI: A Vision-Language Model for UI and Infographics Understanding Paper • 2402.04615 • Published Feb 7 • 36
Spotting LLMs With Binoculars: Zero-Shot Detection of Machine-Generated Text Paper • 2401.12070 • Published Jan 22 • 42
Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data Paper • 2401.10891 • Published Jan 19 • 58
Improving fine-grained understanding in image-text pre-training Paper • 2401.09865 • Published Jan 18 • 15
Blending Is All You Need: Cheaper, Better Alternative to Trillion-Parameters LLM Paper • 2401.02994 • Published Jan 4 • 47
Progressive Knowledge Distillation Of Stable Diffusion XL Using Layer Level Loss Paper • 2401.02677 • Published Jan 5 • 21
Fairy: Fast Parallelized Instruction-Guided Video-to-Video Synthesis Paper • 2312.13834 • Published Dec 20, 2023 • 26
LLM in a flash: Efficient Large Language Model Inference with Limited Memory Paper • 2312.11514 • Published Dec 12, 2023 • 257
Nice Gradio Chatbot UIs Collection The following Chatbot UIs or Projects have been created and are highly regarded by the community. • 4 items • Updated Dec 20, 2023 • 7
FreeInit: Bridging Initialization Gap in Video Diffusion Models Paper • 2312.07537 • Published Dec 12, 2023 • 26
Vary: Scaling up the Vision Vocabulary for Large Vision-Language Models Paper • 2312.06109 • Published Dec 11, 2023 • 20
Custom Components ✨ Collection Awesome Gradio custom components to get you started building your own! • 7 items • Updated Nov 20, 2023 • 35
Readout Guidance: Learning Control from Diffusion Features Paper • 2312.02150 • Published Dec 4, 2023 • 3
DMV3D: Denoising Multi-View Diffusion using 3D Large Reconstruction Model Paper • 2311.09217 • Published Nov 15, 2023 • 21
SALMONN: Towards Generic Hearing Abilities for Large Language Models Paper • 2310.13289 • Published Oct 20, 2023 • 17
👐🏻Accessible🧱Gradio🦹🏻🦸🏻♀️Themes Collection This is a collection of Gradio themes that conform to W3C's a11y color guidelines and recommendations. • 13 items • Updated Oct 3, 2023 • 12
LLM-Grounder: Open-Vocabulary 3D Visual Grounding with Large Language Model as an Agent Paper • 2309.12311 • Published Sep 21, 2023 • 17
3D Gaussian Splatting for Real-Time Radiance Field Rendering Paper • 2308.04079 • Published Aug 8, 2023 • 168
SEED-Bench: Benchmarking Multimodal LLMs with Generative Comprehension Paper • 2307.16125 • Published Jul 30, 2023 • 6
Llama 2: Open Foundation and Fine-Tuned Chat Models Paper • 2307.09288 • Published Jul 18, 2023 • 241
One-for-All: Generalized LoRA for Parameter-Efficient Fine-tuning Paper • 2306.07967 • Published Jun 13, 2023 • 24
Rerender A Video: Zero-Shot Text-Guided Video-to-Video Translation Paper • 2306.07954 • Published Jun 13, 2023 • 113
Agile Catching with Whole-Body MPC and Blackbox Policy Learning Paper • 2306.08205 • Published Jun 14, 2023 • 9