Toward Robust Hyper-Detailed Image Captioning: A Multiagent Approach and Dual Evaluation Metrics for Factuality and Coverage Paper • 2412.15484 • Published 25 days ago • 14
VoiceGuider: Enhancing Out-of-Domain Performance in Parameter-Efficient Speaker-Adaptive Text-to-Speech via Autoguidance Paper • 2409.15759 • Published Sep 24, 2024 • 1
NanoVoice: Efficient Speaker-Adaptive Text-to-Speech for Multiple Speakers Paper • 2409.15760 • Published Sep 24, 2024 • 1
Style-Friendly SNR Sampler for Style-Driven Generation Paper • 2411.14793 • Published Nov 22, 2024 • 36