Models
Datasets
Spaces
Posts
Docs
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2405.17405

Autoregressive Model Beats Diffusion: Llama for Scalable Image Generation

Paper • 2406.06525 • Published Jun 10 • 64
Husky: A Unified, Open-Source Language Agent for Multi-Step Reasoning

Paper • 2406.06469 • Published Jun 10 • 23
Buffer of Thoughts: Thought-Augmented Reasoning with Large Language Models

Paper • 2406.04271 • Published Jun 6 • 27
Block Transformer: Global-to-Local Language Modeling for Fast Inference

Paper • 2406.02657 • Published Jun 4 • 36

Video-Gen Diffusion_4D(DiT etc)

Human4DiT: Free-view Human Video Generation with 4D Diffusion Transformer

Paper • 2405.17405 • Published May 27 • 14
EasyAnimate: A High-Performance Long Video Generation Method based on Transformer Architecture

Paper • 2405.18991 • Published May 29 • 12
4Diffusion: Multi-view Video Diffusion Model for 4D Generation

Paper • 2405.20674 • Published May 31 • 11
4Real: Towards Photorealistic 4D Scene Generation via Video Diffusion Models

Paper • 2406.07472 • Published Jun 11 • 10

Human4DiT: Free-view Human Video Generation with 4D Diffusion Transformer

Paper • 2405.17405 • Published May 27 • 14

ViewDiff: 3D-Consistent Image Generation with Text-to-Image Models

Paper • 2403.01807 • Published Mar 4 • 7
TripoSR: Fast 3D Object Reconstruction from a Single Image

Paper • 2403.02151 • Published Mar 4 • 12
OOTDiffusion: Outfitting Fusion based Latent Diffusion for Controllable Virtual Try-on

Paper • 2403.01779 • Published Mar 4 • 28
MagicClay: Sculpting Meshes With Generative Neural Fields

Paper • 2403.02460 • Published Mar 4 • 6

The latest AI-powered technologies usher in a new era of realistic avatars! 🚀

EMO: Emote Portrait Alive - Generating Expressive Portrait Videos with Audio2Video Diffusion Model under Weak Conditions

Paper • 2402.17485 • Published Feb 27 • 188
VividTalk: One-Shot Audio-Driven Talking Head Generation Based on 3D Hybrid Prior

Paper • 2312.01841 • Published Dec 4, 2023 • 1
MagicAnimate: Temporally Consistent Human Image Animation using Diffusion Model

Paper • 2311.16498 • Published Nov 27, 2023 • 1
GaussianAvatar: Towards Realistic Human Avatar Modeling from a Single Video via Animatable 3D Gaussians

Paper • 2312.02134 • Published Dec 4, 2023 • 2

Seamless Human Motion Composition with Blended Positional Encodings

Paper • 2402.15509 • Published Feb 23 • 14
TripoSR: Fast 3D Object Reconstruction from a Single Image

Paper • 2403.02151 • Published Mar 4 • 12
3D-VLA: A 3D Vision-Language-Action Generative World Model

Paper • 2403.09631 • Published Mar 14 • 7
Controllable Text-to-3D Generation via Surface-Aligned Gaussian Splatting

Paper • 2403.09981 • Published Mar 15 • 6

MM-LLMs: Recent Advances in MultiModal Large Language Models

Paper • 2401.13601 • Published Jan 24 • 44
A Touch, Vision, and Language Dataset for Multimodal Alignment

Paper • 2402.13232 • Published Feb 20 • 13
Neural Network Diffusion

Paper • 2402.13144 • Published Feb 20 • 94
FlashTex: Fast Relightable Mesh Texturing with LightControlNet

Paper • 2402.13251 • Published Feb 20 • 13

WorldDreamer: Towards General World Models for Video Generation via Predicting Masked Tokens

Paper • 2401.09985 • Published Jan 18 • 14
CustomVideo: Customizing Text-to-Video Generation with Multiple Subjects

Paper • 2401.09962 • Published Jan 18 • 7
Inflation with Diffusion: Efficient Temporal Adaptation for Text-to-Video Super-Resolution

Paper • 2401.10404 • Published Jan 18 • 10
ActAnywhere: Subject-Aware Video Background Generation

Paper • 2401.10822 • Published Jan 19 • 13

3D Reconstruction

DiffusionGAN3D: Boosting Text-guided 3D Generation and Domain Adaption by Combining 3D GANs and Diffusion Priors

Paper • 2312.16837 • Published Dec 28, 2023 • 5
Learning the 3D Fauna of the Web

Paper • 2401.02400 • Published Jan 4 • 9
Zero123++: a Single Image to Consistent Multi-view Diffusion Base Model

Paper • 2310.15110 • Published Oct 23, 2023 • 2
Zero-1-to-3: Zero-shot One Image to 3D Object

Paper • 2303.11328 • Published Mar 20, 2023 • 5

One-for-All: Generalized LoRA for Parameter-Efficient Fine-tuning

Paper • 2306.07967 • Published Jun 13, 2023 • 24
Rerender A Video: Zero-Shot Text-Guided Video-to-Video Translation

Paper • 2306.07954 • Published Jun 13, 2023 • 113
TryOnDiffusion: A Tale of Two UNets

Paper • 2306.08276 • Published Jun 14, 2023 • 72
Seeing the World through Your Eyes

Paper • 2306.09348 • Published Jun 15, 2023 • 32

Company

© Hugging Face

TOS Privacy About Jobs

Website

Models Datasets Spaces Pricing Docs