Tae-Hyun Oh

taehyunoh

http://ami.postech.ac.kr

taehyunoh's activity

authored 17 papers about 1 month ago

Prefix tuning for automated audio captioning

Paper • 2303.17489 • Published Mar 30, 2023

Sound Source Localization is All about Cross-Modal Alignment

Paper • 2309.10724 • Published Sep 19, 2023

Speech2Face: Learning the Face Behind a Voice

Paper • 1905.09773 • Published May 23, 2019

LaughTalk: Expressive 3D Talking Head Generation with Laughter

Paper • 2311.00994 • Published Nov 2, 2023

SMILE: Multimodal Dataset for Understanding Laughter in Video with Language Models

Paper • 2312.09818 • Published Dec 15, 2023

Paint-it: Text-to-Texture Synthesis via Deep Convolutional Texture Map Optimization and Physically-Based Rendering

Paper • 2312.11360 • Published Dec 18, 2023 • 1

FedPara: Low-Rank Hadamard Product for Communication-Efficient Federated Learning

Paper • 2108.06098 • Published Aug 13, 2021 • 2

TextManiA: Enriching Visual Feature by Text-driven Manifold Augmentation

Paper • 2307.14611 • Published Jul 27, 2023

Noise Map Guidance: Inversion with Spatial Context for Real Image Editing

Paper • 2402.04625 • Published Feb 7, 2024

Object-Centric Domain Randomization for 3D Shape Reconstruction in the Wild

Paper • 2403.14539 • Published Mar 21, 2024

Cross-Attention of Disentangled Modalities for 3D Human Mesh Recovery with Transformers

Paper • 2207.13820 • Published Jul 27, 2022

Scratching Visual Transformer's Back with Uniform Attention

Paper • 2210.08457 • Published Oct 16, 2022

MultiTalk: Enhancing 3D Talking Head Generation Across Languages with Multilingual Video Dataset

Paper • 2406.14272 • Published Jun 20, 2024

Contextually Customized Video Summaries via Natural Language

Paper • 1702.01528 • Published Feb 6, 2017

BEAF: Observing BEfore-AFter Changes to Evaluate Hallucination in Vision-language Models

Paper • 2407.13442 • Published Jul 18, 2024

Enhancing Speech-Driven 3D Facial Animation with Audio-Visual Guidance from Lip Reading Expert

Paper • 2407.01034 • Published Jul 1, 2024

DisCoRD: Discrete Tokens to Continuous Motion via Rectified Flow Decoding

Paper • 2411.19527 • Published Nov 29, 2024 • 10