An end-to-end (e2e) Voice Language Model by Fish Audio.
Expressive Portrait Animation w/ Hierarchical Motion Attention
High-fidelity Virtual Try-on
Run GGUF directly in your browser!
2-bit inference for LLMs
Compare Open LLM Leaderboard results
Demo for DocLayout-YOLO
A leaderboard for multimodal models
Vote on the top HF TTS models!
Fast, efficient, & multilingual text-to-speech
Realtime implementation of Whisper large turbo