Shivam Mehta

shivammehta25

http://www.shivammehta.me

AI & ML interests

Speech, Audio, LLM, Flow Matching, Diffusion, Flows, HMMs

shivammehta25's activity

upvoted a paper 3 months ago

EzAudio: Enhancing Text-to-Audio Generation with Efficient Diffusion Transformer

Paper • 2409.10819 • Published Sep 17 • 18

upvoted a paper 7 months ago

XTTS: a Massively Multilingual Zero-Shot Text-to-Speech Model

Paper • 2406.04904 • Published Jun 7 • 4

upvoted a paper 8 months ago

Fake it to make it: Using synthetic data to remedy the data shortage in joint multimodal speech-and-gesture synthesis

Paper • 2404.19622 • Published Apr 30 • 2

upvoted a paper 9 months ago

Infinite-ID: Identity-preserved Personalization via ID-semantics Decoupling Paradigm

Paper • 2403.11781 • Published Mar 18 • 17

upvoted a paper 11 months ago

OLMo: Accelerating the Science of Language Models

Paper • 2402.00838 • Published Feb 1 • 82

upvoted a paper about 1 year ago

Schrodinger Bridges Beat Diffusion Models on Text-to-Speech Synthesis

Paper • 2312.03491 • Published Dec 6, 2023 • 33

upvoted 2 papers over 1 year ago

Diff-TTSG: Denoising probabilistic integrated speech and gesture synthesis

Paper • 2306.09417 • Published Jun 15, 2023 • 3

Matcha-TTS: A fast TTS architecture with conditional flow matching

Paper • 2309.03199 • Published Sep 6, 2023 • 11