Commit e825916 by eeishaan · 1 Parent(s): c01f745

Update README.md

Files changed (1)
  1. README.md +63 -0
README.md CHANGED
@@ -1,3 +1,66 @@
1
  ---
2
  license: mit
3
  ---
1
  ---
2
+ '[object Object]': null
3
  license: mit
4
+ tags:
5
+ - speech
6
+ - deep-learning
7
+ - GAN
8
+ - generative-model
9
+ - RVQ
10
  ---
11
+
12
+
13
+ # Descript Audio Codec
14
+
15
+ 👉 With Descript Audio Codec, you can compress **44.1 kHz audio** into discrete codes at a **low 8 kbps bitrate**. <br>
16
+ 🤌 That's approximately **90x compression** while maintaining exceptional fidelity and minimizing artifacts. <br>
17
+ 💪 Our universal model works on all domains (speech, environment, music, etc.), making it widely applicable to generative modeling of all audio. <br>
18
+ 👌 It can be used as a drop-in replacement for EnCodec for all audio language modeling applications (such as AudioLMs, MusicLMs, MusicGen, etc.). <br>
19
+
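As a rough check on the "approximately 90x" figure above, the sketch below compares the 8 kbps code stream against uncompressed PCM. It assumes 16-bit mono PCM as the baseline, which the model card does not state; the function names are illustrative only.

```python
def pcm_bitrate_kbps(sample_rate_hz, bit_depth=16, channels=1):
    """Bitrate of uncompressed PCM audio in kilobits per second."""
    return sample_rate_hz * bit_depth * channels / 1000


def compression_ratio(sample_rate_hz, codec_kbps, bit_depth=16, channels=1):
    """Ratio of the raw PCM bitrate to the codec bitrate."""
    return pcm_bitrate_kbps(sample_rate_hz, bit_depth, channels) / codec_kbps


# Assumption: the baseline is 16-bit mono PCM at 44.1 kHz.
raw = pcm_bitrate_kbps(44_100)
ratio = compression_ratio(44_100, 8)
print(f"{raw:.1f} kbps raw PCM, {ratio:.1f}x compression at 8 kbps")
# prints "705.6 kbps raw PCM, 88.2x compression at 8 kbps"
```

Under that assumption the ratio comes out near 88x, consistent with the rounded 90x quoted above. A stereo or higher bit-depth baseline would give a proportionally larger ratio.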
20
+ ## Model Details
21
+
22
+ ### Model Description
23
+
24
+ - **License:** MIT
25
+
26
+ ### Model Sources
27
+
28
+ - **Repository:** [GitHub Repo](https://github.com/descriptinc/descript-audio-codec)
29
+ - **Paper:** [arXiv Paper: High-Fidelity Audio Compression with Improved RVQGAN
30
+ ](http://arxiv.org/abs/2306.06546)
31
+ - **Demo:** [Demo Site](https://descript.notion.site/Descript-Audio-Codec-11389fce0ce2419891d6591a68f814d5)
32
+
33
+ ## Uses
34
+
35
+ The model is intended for compressing audio files containing speech, music and environmental sounds.
36
+
37
+ ### Out-of-Scope Use
38
+
39
+ It is not intended for compressing non-audio data such as text or images.
40
+
41
+ ## Bias, Risks, and Limitations
42
+ Our model has difficulty reconstructing some challenging audio. It
43
+ performs best on speech and has more trouble with environmental sounds. It
44
+ does not perfectly reconstruct some musical sounds, such as glockenspiel or synthesizers.
45
+
46
+
47
+ ## How to Get Started with the Model
48
+ This model is meant to be used with our official repo linked above. The weights are released here as a redundant copy.
49
+ Our code can pull the weights from their
50
+ [original location on GitHub](https://github.com/descriptinc/descript-audio-codec/releases/download/0.0.1/weights.pth).
51
+ Please refer to the official [README](https://github.com/descriptinc/descript-audio-codec#readme) for usage instructions.
52
+
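For readers who want a local copy of the weights without going through the official code, the following is a minimal sketch using only the Python standard library. It is not the repository's own download logic: `fetch_weights` and its `fetch` parameter are hypothetical names introduced here, and the only detail taken from this README is the release URL.

```python
from pathlib import Path
from urllib.request import urlretrieve

# Release URL quoted in this README.
WEIGHTS_URL = (
    "https://github.com/descriptinc/descript-audio-codec"
    "/releases/download/0.0.1/weights.pth"
)


def fetch_weights(dest_dir, url=WEIGHTS_URL, fetch=urlretrieve):
    """Download the weights into dest_dir unless a copy is already there.

    Hypothetical helper, not part of the official package. `fetch` is
    injectable so the caching logic can be exercised without network access.
    """
    dest = Path(dest_dir) / url.rsplit("/", 1)[-1]
    if not dest.exists():
        dest.parent.mkdir(parents=True, exist_ok=True)
        fetch(url, str(dest))
    return dest
```

Calling `fetch_weights("checkpoints")` would save the file as `checkpoints/weights.pth` and reuse it on later calls. Loading the checkpoint into a model is still best done through the official README's instructions.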
53
+ ## Citation
54
+
55
+ **BibTeX:**
56
+
57
+ ```
58
+ @misc{kumar2023highfidelity,
59
+ title={High-Fidelity Audio Compression with Improved RVQGAN},
60
+ author={Rithesh Kumar and Prem Seetharaman and Alejandro Luebs and Ishaan Kumar and Kundan Kumar},
61
+ year={2023},
62
+ eprint={2306.06546},
63
+ archivePrefix={arXiv},
64
+ primaryClass={cs.SD}
65
+ }
66
+ ```