abdar1925's Collections

Audio models

updated May 29

  • kyutai/moshika-vis-pytorch-bf16

    Updated Jun 18 • 56

  • sesame/csm-1b

    Text-to-Speech • Updated 26 days ago • 27.9k • 2.17k

  • kyutai/mimi

    Feature Extraction • 0.1B • Updated Jul 2 • 371k • 230

  • kyutai/moshiko-pytorch-bf16

    Updated Sep 18, 2024 • 172k • 185

  • nvidia/canary-1b-flash

    Automatic Speech Recognition • 0.8B • Updated 1 day ago • 16.7k • 242

  • canopylabs/orpheus-3b-0.1-ft

    Text-to-Speech • 4B • Updated May 6 • 17.1k • 610

  • stepfun-ai/Step-Audio-Chat

    Audio-Text-to-Text • 132B • Updated Feb 17 • 33 • 454

  • Zyphra/Zonos-v0.1-hybrid

    Text-to-Speech • Updated Jun 3 • 67.9k • 1.1k

  • hexgrad/Kokoro-82M

    Text-to-Speech • Updated Apr 10 • 1.88M • 4.87k

  • Qwen/Qwen2.5-Omni-7B

    Any-to-Any • 11B • Updated Apr 30 • 132k • 1.75k

  • ChatAnyone: Stylized Real-time Portrait Video Generation with Hierarchical Motion Diffusion Model

    Paper • 2503.21144 • Published Mar 27 • 27

  • nari-labs/Dia-1.6B

    Text-to-Speech • Updated Jun 1 • 99.4k • 2.7k

  • nvidia/parakeet-tdt-0.6b-v2

    Automatic Speech Recognition • Updated 1 day ago • 367k • 1.3k

  • ResembleAI/chatterbox

    Text-to-Speech • Updated May 30 • 829k • 994