VividFace: A Diffusion-Based Hybrid Framework for High-Fidelity Video Face Swapping Paper • 2412.11279 • Published 10 days ago • 12
EasyRef: Omni-Generalized Group Image Reference for Diffusion Models via Multimodal LLM Paper • 2412.09618 • Published 13 days ago • 21
Can LLMs be Good Graph Judger for Knowledge Graph Construction? Paper • 2411.17388 • Published 30 days ago
JourneyDB: A Benchmark for Generative Image Understanding Paper • 2307.00716 • Published Jul 3, 2023 • 19
Question Answering as Programming for Solving Time-Sensitive Questions Paper • 2305.14221 • Published May 23, 2023
AutoConv: Automatically Generating Information-seeking Conversations with Large Language Models Paper • 2308.06507 • Published Aug 12, 2023 • 1
Unchosen Experts Can Contribute Too: Unleashing MoE Models' Power by Self-Contrast Paper • 2405.14507 • Published May 23, 2024
GenCA: A Text-conditioned Generative Model for Realistic and Drivable Codec Avatars Paper • 2408.13674 • Published Aug 24, 2024 • 18
RCNet: Reverse Feature Pyramid and Cross-scale Shift Network for Object Detection Paper • 2110.12130 • Published Oct 23, 2021
MoVA: Adapting Mixture of Vision Experts to Multimodal Context Paper • 2404.13046 • Published Apr 19, 2024 • 1
Exploring the Role of Large Language Models in Prompt Encoding for Diffusion Models Paper • 2406.11831 • Published Jun 17, 2024 • 21
CoMat: Aligning Text-to-Image Diffusion Model with Image-to-Text Concept Matching Paper • 2404.03653 • Published Apr 4, 2024 • 33
Visual CoT: Unleashing Chain-of-Thought Reasoning in Multi-Modal Language Models Paper • 2403.16999 • Published Mar 25, 2024 • 4
Temporal Enhanced Training of Multi-view 3D Object Detector via Historical Object Prediction Paper • 2304.00967 • Published Apr 3, 2023
RAPHAEL: Text-to-Image Generation via Large Mixture of Diffusion Paths Paper • 2305.18295 • Published May 29, 2023 • 7
MAP-Neo: Highly Capable and Transparent Bilingual Large Language Model Series Paper • 2405.19327 • Published May 29, 2024 • 46