AELXKANG21's Collections: any size diffusion
Any-Size-Diffusion: Toward Efficient Text-Driven Synthesis for Any-Size HD Images (arXiv:2308.16582)
DreamSpace: Dreaming Your Room Space with Text-Driven Panoramic Texture Propagation (arXiv:2310.13119)
DreamCraft3D: Hierarchical 3D Generation with Bootstrapped Diffusion Prior (arXiv:2310.16818)
Text-to-3D with Classifier Score Distillation (arXiv:2310.19415)
SEINE: Short-to-Long Video Diffusion Model for Generative Transition and Prediction (arXiv:2310.20700)
Controlling Text-to-Image Diffusion by Orthogonal Finetuning (arXiv:2306.07280)
DMV3D: Denoising Multi-View Diffusion using 3D Large Reconstruction Model (arXiv:2311.09217)
One-2-3-45++: Fast Single Image to 3D Objects with Consistent Multi-View Generation and 3D Diffusion (arXiv:2311.07885)
Emu Video: Factorizing Text-to-Video Generation by Explicit Image Conditioning (arXiv:2311.10709)
LucidDreamer: Towards High-Fidelity Text-to-3D Generation via Interval Score Matching (arXiv:2311.11284)
Concept Sliders: LoRA Adaptors for Precise Control in Diffusion Models (arXiv:2311.12092)
Diffusion360: Seamless 360 Degree Panoramic Image Generation based on Diffusion Models (arXiv:2311.13141)
ZipLoRA: Any Subject in Any Style by Effectively Merging LoRAs (arXiv:2311.13600)
An Embodied Generalist Agent in 3D World (arXiv:2311.12871)
LEDITS++: Limitless Image Editing using Text-to-Image Models (arXiv:2311.16711)
GraphDreamer: Compositional 3D Scene Synthesis from Scene Graphs (arXiv:2312.00093)
CoDi-2: In-Context, Interleaved, and Interactive Any-to-Any Generation (arXiv:2311.18775)
VideoBooth: Diffusion-based Video Generation with Image Prompts (arXiv:2312.00777)
HiFi Tuner: High-Fidelity Subject-Driven Fine-Tuning for Diffusion Models (arXiv:2312.00079)
Segment and Caption Anything (arXiv:2312.00869)
PyNeRF: Pyramidal Neural Radiance Fields (arXiv:2312.00252)
DeepCache: Accelerating Diffusion Models for Free (arXiv:2312.00858)
Customize your NeRF: Adaptive Source Driven 3D Scene Editing via Local-Global Iterative Training (arXiv:2312.01663)
HiFi4G: High-Fidelity Human Performance Rendering via Compact Gaussian Splatting (arXiv:2312.03461)
Cache Me if You Can: Accelerating Diffusion Models through Block Caching (arXiv:2312.03209)
TokenCompose: Grounding Diffusion with Token-level Supervision (arXiv:2312.03626)
HybridNeRF: Efficient Neural Rendering via Adaptive Volumetric Surfaces (arXiv:2312.03160)
Context Diffusion: In-Context Aware Image Generation (arXiv:2312.03584)
MotionCtrl: A Unified and Flexible Motion Controller for Video Generation (arXiv:2312.03641)
LooseControl: Lifting ControlNet for Generalized Depth Conditioning (arXiv:2312.03079)
Customizing Motion in Text-to-Video Diffusion Models (arXiv:2312.04966)
Text-to-3D Generation with Bidirectional Diffusion using both 2D and 3D priors (arXiv:2312.04963)
MVDD: Multi-View Depth Diffusion Models (arXiv:2312.04875)
3D-LLM: Injecting the 3D World into Large Language Models (arXiv:2307.12981)
DreaMoving: A Human Dance Video Generation Framework based on Diffusion Models (arXiv:2312.05107)
Upscale-A-Video: Temporal-Consistent Diffusion Model for Real-World Video Super-Resolution (arXiv:2312.06640)
NeRFiller: Completing Scenes via Generative 3D Inpainting (arXiv:2312.04560)
Fast Training of Diffusion Transformer with Extreme Masking for 3D Point Clouds Generation (arXiv:2312.07231)
Clockwork Diffusion: Efficient Generation With Model-Step Distillation (arXiv:2312.08128)
CLIP as RNN: Segment Countless Visual Concepts without Training Endeavor (arXiv:2312.07661)
UniDream: Unifying Diffusion Priors for Relightable Text-to-3D Generation (arXiv:2312.08754)
VideoLCM: Video Latent Consistency Model (arXiv:2312.09109)
Mosaic-SDF for 3D Generative Models (arXiv:2312.09222)
Holodeck: Language Guided Generation of 3D Embodied AI Environments (arXiv:2312.09067)
FineControlNet: Fine-level Text Control for Image Generation with Spatially Aligned Text Control Injection (arXiv:2312.09252)
Pixel Aligned Language Models (arXiv:2312.09237)
SHAP-EDITOR: Instruction-guided Latent 3D Editing in Seconds (arXiv:2312.09246)
SEEAvatar: Photorealistic Text-to-3D Avatar Generation with Constrained Geometry and Appearance (arXiv:2312.08889)
Faster Diffusion: Rethinking the Role of UNet Encoder in Diffusion Models (arXiv:2312.09608)
Stable Score Distillation for High-Quality 3D Generation (arXiv:2312.09305)
Dream2Real: Zero-Shot 3D Object Rearrangement with Vision-Language Models (arXiv:2312.04533)
Towards Accurate Guided Diffusion Sampling through Symplectic Adjoint Method (arXiv:2312.12030)
TIP: Text-Driven Image Processing with Semantic and Restoration Instructions (arXiv:2312.11595)
Tracking Any Object Amodally (arXiv:2312.12433)
MixRT: Mixed Neural Representations For Real-Time NeRF Rendering (arXiv:2312.11841)
Customize-It-3D: High-Quality 3D Creation from A Single Image Using Subject-Specific Knowledge Prior (arXiv:2312.11535)
GAvatar: Animatable 3D Gaussian Avatars with Implicit Mesh Learning (arXiv:2312.11461)
Repaint123: Fast and High-quality One Image to 3D Generation with Progressive Controllable 2D Repainting (arXiv:2312.13271)
SpecNeRF: Gaussian Directional Encoding for Specular Reflections (arXiv:2312.13102)
InstructVideo: Instructing Video Diffusion Models with Human Feedback (arXiv:2312.12490)
StreamDiffusion: A Pipeline-level Solution for Real-time Interactive Generation (arXiv:2312.12491)
Splatter Image: Ultra-Fast Single-View 3D Reconstruction (arXiv:2312.13150)
UniSDF: Unifying Neural Representations for High-Fidelity 3D Reconstruction of Complex Scenes with Reflections (arXiv:2312.13285)
Model-Based Control with Sparse Neural Dynamics (arXiv:2312.12791)
LASA: Instance Reconstruction from Real Scans using A Large-scale Aligned Shape Annotation Dataset (arXiv:2312.12418)
Neural feels with neural fields: Visuo-tactile perception for in-hand manipulation (arXiv:2312.13469)
Paint3D: Paint Anything 3D with Lighting-Less Texture Diffusion Models (arXiv:2312.13913)
HD-Painter: High-Resolution and Prompt-Faithful Text-Guided Image Inpainting with Diffusion Models (arXiv:2312.14091)
HeadCraft: Modeling High-Detail Shape Variations for Animated 3DMMs (arXiv:2312.14140)
Fairy: Fast Parallelized Instruction-Guided Video-to-Video Synthesis (arXiv:2312.13834)
Carve3D: Improving Multi-view Reconstruction Consistency for Diffusion Models with RL Finetuning (arXiv:2312.13980)
ControlRoom3D: Room Generation using Semantic Proxy Rooms (arXiv:2312.05208)
DyBluRF: Dynamic Deblurring Neural Radiance Fields for Blurry Monocular Video (arXiv:2312.13528)
DiffPortrait3D: Controllable Diffusion for Zero-Shot Portrait View Synthesis (arXiv:2312.13016)
ShowRoom3D: Text to High-Quality 3D Room Generation Using 3D Priors (arXiv:2312.13324)
MACS: Mass Conditioned 3D Hand and Object Motion Synthesis (arXiv:2312.14929)
LangSplat: 3D Language Gaussian Splatting (arXiv:2312.16084)
City-on-Web: Real-time Neural Rendering of Large-scale Scenes on the Web (arXiv:2312.16457)
Spacetime Gaussian Feature Splatting for Real-Time Dynamic View Synthesis (arXiv:2312.16812)
Restoration by Generation with Constrained Priors (arXiv:2312.17161)
DiffusionGAN3D: Boosting Text-guided 3D Generation and Domain Adaption by Combining 3D GANs and Diffusion Priors (arXiv:2312.16837)
PanGu-Draw: Advancing Resource-Efficient Text-to-Image Synthesis with Time-Decoupled Training and Reusable Coop-Diffusion (arXiv:2312.16486)
I2V-Adapter: A General Image-to-Video Adapter for Video Diffusion Models (arXiv:2312.16693)
Prompt Expansion for Adaptive Text-to-Image Generation (arXiv:2312.16720)
En3D: An Enhanced Generative Model for Sculpting 3D Humans from 2D Synthetic Data (arXiv:2401.01173)
SIGNeRF: Scene Integrated Generation for Neural Radiance Fields (arXiv:2401.01647)
Efficient Hybrid Zoom using Camera Fusion on Mobile Phones (arXiv:2401.01461)
Instruct-Imagen: Image Generation with Multi-modal Instruction (arXiv:2401.01952)
Denoising Vision Transformers (arXiv:2401.02957)
CoDA: Collaborative Novel Box Discovery and Cross-modal Alignment for Open-vocabulary 3D Object Detection (arXiv:2310.02960)
MagicVideo-V2: Multi-Stage High-Aesthetic Video Generation (arXiv:2401.04468)
GPT-4V(ision) is a Human-Aligned Evaluator for Text-to-3D Generation (arXiv:2401.04092)
ODIN: A Single Model for 2D and 3D Perception (arXiv:2401.02416)
PIXART-δ: Fast and Controllable Image Generation with Latent Consistency Models (arXiv:2401.05252)
URHand: Universal Relightable Hands (arXiv:2401.05334)
InseRF: Text-Driven Generative Object Insertion in Neural 3D Scenes (arXiv:2401.05335)
PhotoMaker: Customizing Realistic Human Photos via Stacked ID Embedding (arXiv:2312.04461)
HexaGen3D: StableDiffusion is just one step away from Fast and Diverse Text-to-3D Generation (arXiv:2401.07727)
Multi-Track Timeline Control for Text-Driven 3D Human Motion Generation (arXiv:2401.08559)
DiffusionGPT: LLM-Driven Text-to-Image Generation System (arXiv:2401.10061)
VMamba: Visual State Space Model (arXiv:2401.10166)
CustomVideo: Customizing Text-to-Video Generation with Multiple Subjects (arXiv:2401.09962)
SiT: Exploring Flow and Diffusion-based Generative Models with Scalable Interpolant Transformers (arXiv:2401.08740)
SpatialVLM: Endowing Vision-Language Models with Spatial Reasoning Capabilities (arXiv:2401.12168)
Mastering Text-to-Image Diffusion: Recaptioning, Planning, and Generating with Multimodal LLMs (arXiv:2401.11708)
Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data (arXiv:2401.10891)
Fast Registration of Photorealistic Avatars for VR Facial Animation (arXiv:2401.11002)
Sketch2NeRF: Multi-view Sketch-guided Text-to-3D Generation (arXiv:2401.14257)
SPAD: Spatially Aware Multiview Diffusers (arXiv:2402.05235)
IM-3D: Iterative Multiview Diffusion and Reconstruction for High-Quality 3D Generation (arXiv:2402.08682)
Self-Play Fine-Tuning of Diffusion Models for Text-to-Image Generation (arXiv:2402.10210)
CityDreamer: Compositional Generative Model of Unbounded 3D Cities (arXiv:2309.00610)
Multistep Consistency Models (arXiv:2403.06807)
Jamba: A Hybrid Transformer-Mamba Language Model (arXiv:2403.19887)
Mesh2NeRF: Direct Mesh Supervision for Neural Radiance Field Representation and Generation (arXiv:2403.19319)
FruitNeRF: A Unified Neural Radiance Field based Fruit Counting Framework (arXiv:2408.06190)
xGen-VideoSyn-1: High-fidelity Text-to-Video Synthesis with Compressed Representations (arXiv:2408.12590)
Towards Realistic Example-based Modeling via 3D Gaussian Stitching (arXiv:2408.15708)
DepthCrafter: Generating Consistent Long Depth Sequences for Open-world Videos (arXiv:2409.02095)
Open-MAGVIT2: An Open-Source Project Toward Democratizing Auto-regressive Visual Generation (arXiv:2409.04410)
Hi3D: Pursuing High-Resolution Image-to-3D Generation with Video Diffusion Models (arXiv:2409.07452)
Fine-Tuning Image-Conditional Diffusion Models is Easier than You Think (arXiv:2409.11355)
Portrait Video Editing Empowered by Multimodal Generative Priors (arXiv:2409.13591)
Align3R: Aligned Monocular Depth Estimation for Dynamic Videos (arXiv:2412.03079)