---
license: mit
pipeline_tag: audio-to-audio
---
|
|
|
## MusicMaker - Transformer Model for Music Generation |
|
|
|
#### Overview: |
|
|
|
MusicMaker is a transformer-based model trained to generate novel musical compositions in the MIDI format. By learning from a dataset of piano MIDI files, the model can capture the intricate patterns and structures present in music and generate coherent and creative melodies. |
|
|
|
#### Key Features: |
|
|
|
- Generation of novel musical compositions in MIDI format |
|
- Trained on a dataset of piano MIDI files |
|
- Based on transformer architecture for capturing long-range dependencies |
|
- Tokenizer trained specifically on MIDI data with the miditok library (see the sketch below)
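
The tokenizer bundled with this model can be reproduced along the following lines. This is a minimal sketch, assuming miditok 3.x: the `REMI` tokenizer class, the `train` method, the dataset path, and the output directory are illustrative choices here, not the exact recipe behind this checkpoint.

```py
from pathlib import Path

from miditok import REMI, TokenizerConfig

# Collect the training MIDI files (path is hypothetical)
midi_paths = list(Path("midi_dataset").glob("**/*.mid"))

# Start from a base MIDI tokenizer; REMI is one common choice
tokenizer = REMI(TokenizerConfig())

# Learn a BPE vocabulary over the corpus, matching the 12,000-token
# vocabulary size reported under Model Details
tokenizer.train(vocab_size=12000, files_paths=midi_paths)

# Save the trained tokenizer so it can later be loaded with from_pretrained
tokenizer.save_pretrained("music_maker_tokenizer")
```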
|
|
|
#### Training Data: |
|
|
|
The model was trained on a dataset of ~11,000 piano MIDI files from the "adl-piano-midi" collection. |
|
|
|
#### Model Details: |
|
|
|
- Architecture: GPT-style transformer |
|
- Number of layers: 12 |
|
- Hidden size: 512 |
|
- Attention heads: 8 |
|
- Tokenizer vocabulary size: 12,000 |
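
The checkpoint ships its own model code (hence `trust_remote_code=True` in the usage example below), so the sketch here is only a reference point: expressed with the standard `transformers` GPT-2 configuration fields, the hyperparameters above would look roughly like this.

```py
from transformers import GPT2Config

# Illustrative GPT-2 style configuration mirroring the numbers above;
# the actual checkpoint defines its own custom model and config classes.
config = GPT2Config(
    vocab_size=12000,  # tokenizer vocabulary size
    n_layer=12,        # transformer layers
    n_embd=512,        # hidden size
    n_head=8,          # attention heads
)
```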
|
|
|
#### Usage: |
|
|
|
```py
from transformers import AutoModel
from miditok import MusicTokenizer
import torch

device = 'cuda' if torch.cuda.is_available() else 'cpu'

# Load the tokenizer and the model (with its custom code) from the Hub
tokenizer = MusicTokenizer.from_pretrained('shikhr/music_maker')
model = AutoModel.from_pretrained('shikhr/music_maker', trust_remote_code=True)
model.to(device)

# Generate some music, starting from a single seed token (id 1)
out = model.generate(
    torch.tensor([[1]]).to(device), max_new_tokens=400, temperature=1.0, top_k=None
)

# Decode the generated token ids back into a score and save it as MIDI
tokenizer(out[0].tolist()).dump_midi("generated.mid")
```
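
The `generate` arguments shown above control sampling: `temperature` scales the logits before sampling, and `top_k` restricts each step to the k most likely tokens (passing `None` disables the filter). A slightly more conservative variant, reusing the objects from the example (the specific values are illustrative):

```py
# Lower temperature plus top-k filtering yields more conservative output
out = model.generate(
    torch.tensor([[1]]).to(device), max_new_tokens=400, temperature=0.8, top_k=50
)
tokenizer(out[0].tolist()).dump_midi("generated_conservative.mid")
```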
|
|
|
#### Limitations and Bias: |
|
|
|
- The model was trained only on piano MIDI data, so it may generalize poorly to other instruments.
|
- The generated music may exhibit some repetitive or unnatural patterns. |
|
- The training data itself may contain certain biases or patterns reflective of its sources. |