- ShareGPT4Video: Improving Video Understanding and Generation with Better Captions
  Paper • 2406.04325 • Published • 71
- MoE-LLaVA: Mixture of Experts for Large Vision-Language Models
  Paper • 2401.15947 • Published • 48
- Video-LLaVA: Learning United Visual Representation by Alignment Before Projection
  Paper • 2311.10122 • Published • 26
- Video-Bench: A Comprehensive Benchmark and Toolkit for Evaluating Video-based Large Language Models
  Paper • 2311.16103 • Published • 1
Collections including paper arxiv:2311.10122
- MM-VID: Advancing Video Understanding with GPT-4V(ision)
  Paper • 2310.19773 • Published • 19
- Fine-grained Audio-Visual Joint Representations for Multimodal Large Language Models
  Paper • 2310.05863 • Published • 1
- Florence-2: Advancing a Unified Representation for a Variety of Vision Tasks
  Paper • 2311.06242 • Published • 79
- I&S-ViT: An Inclusive & Stable Method for Pushing the Limit of Post-Training ViTs Quantization
  Paper • 2311.10126 • Published • 7
- GLaMM: Pixel Grounding Large Multimodal Model
  Paper • 2311.03356 • Published • 33
- CoVLM: Composing Visual Entities and Relationships in Large Language Models Via Communicative Decoding
  Paper • 2311.03354 • Published • 4
- CogVLM: Visual Expert for Pretrained Language Models
  Paper • 2311.03079 • Published • 23
- UnifiedVisionGPT: Streamlining Vision-Oriented AI through Generalized Multimodal Framework
  Paper • 2311.10125 • Published • 4