---
license: mit
---
This is a streamlined version of [WavTokenizer-large-speech-75token](https://huggingface.co/novateur/WavTokenizer-large-speech-75token/tree/main) that exposes the model through separate encoder and decoder components.
- Reduced model size from 1.75GB to ~330MB by keeping only the components needed for inference
- Split interface (82MB encoder, 248MB decoder)
- Simplified integration with a single [.py file](https://github.com/edwko/OuteTTS/blob/main/outetts/wav_tokenizer/model.py)

The model is split into:
- `encoder/`: Handles audio encoding
- `decoder/`: Handles decoding and synthesis
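
To sanity-check a local copy against the sizes quoted above, a minimal sketch like the following may help. It assumes only that the files sit under `encoder/` and `decoder/` directories as listed. The helper names `component_sizes` and `reduction_percent` are hypothetical and are not part of this repository or of OuteTTS.

```python
from pathlib import Path


def component_sizes(root, components=("encoder", "decoder")):
    """Return the total size in bytes of all files under each component directory.

    Assumes the layout described above: <root>/encoder/ and <root>/decoder/.
    A missing directory is reported as 0 bytes instead of raising an error.
    """
    root = Path(root)
    sizes = {}
    for name in components:
        folder = root / name
        if folder.is_dir():
            # Walk the directory recursively and sum regular files only.
            sizes[name] = sum(
                f.stat().st_size for f in folder.rglob("*") if f.is_file()
            )
        else:
            sizes[name] = 0
    return sizes


def reduction_percent(original_size, reduced_size):
    """Percentage by which reduced_size is smaller than original_size."""
    return 100.0 * (1 - reduced_size / original_size)
```

With the figures quoted above (82MB + 248MB = 330MB, against 1.75GB taken as 1750MB), `reduction_percent(1750, 330)` comes out at roughly 81%.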