Commit 9369481
Parent(s): e1a9899
Update README.md
README.md CHANGED
@@ -1,3 +1,39 @@
---
license: openrail
---
# Oud (عود) Unconditional Diffusion

The oud is one of the foundational instruments of Arab music. It can be heard in nearly every song, whether the subgenre is rooted in pop or classical music.
Its distinctive sound can be picked out of a crowd of string instruments with little to no training.
Our unconditional diffusion model aims to respect that sound and the culture it has created.
This project would not have been possible without the [audio-diffusion tools](https://github.com/teticio/audio-diffusion).

## Usage

This model is used like any other audio diffusion model on Hugging Face.

```python
import torch
from diffusers import DiffusionPipeline

# Set up the device and create a random number generator on it
device = "cuda" if torch.cuda.is_available() else "cpu"
generator = torch.Generator(device=device)

# Instantiate the model
model_id = "mijwiz-laboratories/oud_diffusion_unconditional_256"
audio_diffusion = DiffusionPipeline.from_pretrained(model_id).to(device)

# Draw a random seed and apply it; keep `seed` to reproduce this sample later
seed = generator.seed()
generator.manual_seed(seed)

# Run inference
output = audio_diffusion(generator=generator)
image = output.images[0]  # Generated mel spectrogram
audio = output.audios[0, 0]  # Generated audio waveform (array of samples)
```
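
The `audio` value above is an array of samples, not a file on disk. As a minimal sketch of one way to turn such a waveform into a playable WAV file, the helper below uses only the Python standard library. The function name `save_waveform_as_wav` is hypothetical, the samples are assumed to be mono floats in the range [-1.0, 1.0], and the sample rate shown is a placeholder assumption: use whatever rate the model was actually trained with.

```python
import math
import sys
import wave
from array import array


def save_waveform_as_wav(samples, sample_rate, path):
    """Write mono float samples in [-1.0, 1.0] to a 16-bit PCM WAV file.

    Returns the number of samples written.
    """
    pcm = array("h")  # signed 16-bit integers
    for sample in samples:
        # Clip out-of-range values so they cannot overflow 16 bits
        clipped = max(-1.0, min(1.0, float(sample)))
        pcm.append(int(round(clipped * 32767)))

    # WAV data is little-endian; swap bytes on big-endian machines
    if sys.byteorder == "big":
        pcm.byteswap()

    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)  # mono
        wav_file.setsampwidth(2)  # 2 bytes = 16 bits per sample
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(pcm.tobytes())
    return len(pcm)


# Stand-in for the generated waveform: one second of a 440 Hz tone.
SAMPLE_RATE = 22050  # assumed placeholder, not taken from the model card
tone = [
    0.5 * math.sin(2 * math.pi * 440 * n / SAMPLE_RATE)
    for n in range(SAMPLE_RATE)
]
```

With the pipeline output in hand, a call such as `save_waveform_as_wav(audio, sample_rate, "oud_sample.wav")` would write the generated snippet to disk, where `sample_rate` is the rate the model's audio was produced at.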

## Limitations of Model
The dataset used was very small, so the diversity of snippets that can be generated is rather limited. Furthermore, in high-intensity segments (think of a human playing the instrument forcefully), the realism and naturalness of the generated oud samples degrade.