OuteAI/wavtokenizer-large-75token-interface

license: mit

This is a streamlined interface version of WavTokenizer-large-speech-75token, providing a clean, efficient way to interact with the model through separate encoder and decoder components.

- Reduced model size from 1.75GB to ~330MB by keeping only the components needed for inference
- Split interface (82MB encoder, 248MB decoder)
- Simplified integration with just one .py file

The model is split into:

- encoder/: Handles audio encoding
- decoder/: Handles decoding and synthesis
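
As a quick sanity check on the split layout, the sketch below sums the on-disk size of each component directory in a local copy of the repository. It is only an illustration: the function name `component_sizes_mb` is hypothetical and not part of this repository, it assumes the `encoder/` and `decoder/` directories sit directly under the path you pass in, and it counts 1 MB as 10^6 bytes.

```python
from pathlib import Path


def component_sizes_mb(root):
    """Return the total on-disk size, in MB, of each component directory.

    Hypothetical helper: assumes `root` is a local copy of the repository
    containing `encoder/` and `decoder/` subdirectories.
    """
    sizes = {}
    for name in ("encoder", "decoder"):
        component_dir = Path(root) / name
        if not component_dir.is_dir():
            raise FileNotFoundError(f"missing component directory: {component_dir}")
        # Sum every regular file below the component directory.
        total_bytes = sum(
            f.stat().st_size for f in component_dir.rglob("*") if f.is_file()
        )
        sizes[name] = total_bytes / 1_000_000  # assumption: 1 MB = 10^6 bytes
    return sizes
```

Calling `component_sizes_mb("path/to/local/copy")` returns a dictionary keyed by `"encoder"` and `"decoder"`, which can be compared against the sizes listed above. If only one half of the pipeline is needed (for example, encoding audio to tokens), the other directory does not have to be loaded.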