descript/descript-audio-codec


eeishaan committed on Jun 14, 2023

Commit e825916 · 1 Parent(s): c01f745

Update README.md

Files changed (1)

README.md +63 -0


@@ -1,3 +1,66 @@
 ---
 license: mit
+tags:
+- speech
+- deep-learning
+- GAN
+- generative-model
+- RVQ
 ---
+# Descript Audio Codec
+👉 With Descript Audio Codec, you can compress **44.1 kHz audio** into discrete codes at a **low 8 kbps bitrate**.  <br>
+🤌 That's approximately **90x compression** while maintaining exceptional fidelity and minimizing artifacts.  <br>
+💪 Our universal model works on all domains (speech, environment, music, etc.), making it widely applicable to generative modeling of all audio.  <br>
+👌 It can be used as a drop-in replacement for EnCodec in audio language modeling applications (such as AudioLMs, MusicLMs, MusicGen, etc.). <br>
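As a rough sanity check on the "approximately 90x" figure above, the ratio can be worked out from the stated sample rate and bitrate. This is a minimal sketch, not taken from the README: it assumes the uncompressed reference is 16-bit mono PCM, which the README does not state, and the helper name `compression_ratio` is hypothetical.

```python
# Back-of-the-envelope compression ratio for a neural audio codec.
# ASSUMPTION: the uncompressed baseline is 16-bit, single-channel PCM.
# The README states only the 44.1 kHz sample rate and the 8 kbps bitrate.

SAMPLE_RATE_HZ = 44_100      # from the README
CODEC_BITRATE_BPS = 8_000    # from the README (8 kbps)
BITS_PER_SAMPLE = 16         # assumption, not stated in the README
CHANNELS = 1                 # assumption, not stated in the README


def compression_ratio(
    sample_rate_hz: int = SAMPLE_RATE_HZ,
    bits_per_sample: int = BITS_PER_SAMPLE,
    channels: int = CHANNELS,
    codec_bitrate_bps: int = CODEC_BITRATE_BPS,
) -> float:
    """Return the uncompressed PCM bitrate divided by the codec bitrate."""
    pcm_bitrate_bps = sample_rate_hz * bits_per_sample * channels
    return pcm_bitrate_bps / codec_bitrate_bps


if __name__ == "__main__":
    print(f"PCM vs. codec bitrate: {compression_ratio():.1f}x")
```

Under these assumptions the uncompressed stream is 705.6 kbps, which gives a ratio of about 88x, consistent with the rounded figure quoted above.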
+## Model Details
+### Model Description
+- **License:** MIT
+### Model Sources
+- **Repository:** [GitHub Repo](https://github.com/descriptinc/descript-audio-codec)
+- **Paper:** [arXiv Paper: High-Fidelity Audio Compression with Improved RVQGAN](http://arxiv.org/abs/2306.06546)
+- **Demo:** [Demo Site](https://descript.notion.site/Descript-Audio-Codec-11389fce0ce2419891d6591a68f814d5)
+## Uses
+The model is intended for compressing audio files containing speech, music and environmental sounds.
+### Out-of-Scope Use
+It is not intended to be used for compressing other file formats such as text, images, etc.
+## Bias, Risks, and Limitations
+Our model has difficulty reconstructing some challenging audio. It
+performs best on speech and has more issues with environmental sounds. It
+also does not perfectly model some musical instruments, such as the glockenspiel, or synthesizer sounds.
+## How to Get Started with the Model
+This model is meant to be used with our official repo linked above. We host the weights here as a redundant copy.
+Our code can pull the weights from their
+[original location on GitHub](https://github.com/descriptinc/descript-audio-codec/releases/download/0.0.1/weights.pth).
+Please refer to the official [README](https://github.com/descriptinc/descript-audio-codec#readme) for usage instructions.
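For readers who only want a local copy of the weights file, the download step described above can be sketched with the Python standard library. This is an illustration under assumptions, not the official repo's download code: the helper name `fetch_weights` and the caching behavior are hypothetical, and only the URL comes from the README. Loading the weights into a model should still follow the official README.

```python
# Minimal sketch: fetch the released weights file with the standard library.
# ASSUMPTION: `fetch_weights` is a hypothetical helper, not part of the
# official descript-audio-codec package. Only the URL is from the README.
import urllib.request
from pathlib import Path

WEIGHTS_URL = (
    "https://github.com/descriptinc/descript-audio-codec"
    "/releases/download/0.0.1/weights.pth"
)


def fetch_weights(dest_dir: str = ".", url: str = WEIGHTS_URL) -> Path:
    """Download the weights into dest_dir unless the file already exists."""
    dest = Path(dest_dir) / url.rsplit("/", 1)[-1]  # keeps the name "weights.pth"
    if not dest.exists():
        dest.parent.mkdir(parents=True, exist_ok=True)
        urllib.request.urlretrieve(url, str(dest))
    return dest


if __name__ == "__main__":
    print(f"Weights stored at: {fetch_weights('checkpoints')}")
```

Skipping the download when the file is already present keeps repeated runs from hitting the network again.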
+## Citation
+**BibTeX:**
+```
+@misc{kumar2023highfidelity,
+      title={High-Fidelity Audio Compression with Improved RVQGAN},
+      author={Rithesh Kumar and Prem Seetharaman and Alejandro Luebs and Ishaan Kumar and Kundan Kumar},
+      year={2023},
+      eprint={2306.06546},
+      archivePrefix={arXiv},
+      primaryClass={cs.SD}
+}
+```