metadata
license: mit
pipeline_tag: audio-to-audio
MusicMaker - Transformer Model for Music Generation
Overview:
MusicMaker is a transformer-based model trained to generate novel musical compositions in MIDI format. Trained on a dataset of piano MIDI files, it learns the patterns and structures present in that music and uses them to generate coherent, creative melodies.
Key Features:
- Generation of novel musical compositions in MIDI format
- Trained on a dataset of piano MIDI files
- Based on transformer architecture for capturing long-range dependencies
- Tokenizer trained specifically on MIDI data using the miditok library
Training Data:
The model was trained on a dataset of ~11,000 piano MIDI files from the "adl-piano-midi" collection.
Model Details:
- Architecture: GPT-style transformer
- Number of layers: 12
- Hidden size: 512
- Attention heads: 8
- Tokenizer vocabulary size: 12,000
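For a rough sense of scale, the figures above can be turned into an approximate parameter count. This is only a back-of-the-envelope sketch: the helper `estimate_gpt_params` is hypothetical, and it assumes a standard GPT block (four attention projections plus an MLP with a 4x expansion), tied input and output embeddings, and no biases, layer norms, or positional embeddings. None of these details are confirmed by this card, so the real model may differ.

```python
def estimate_gpt_params(n_layers, d_model, vocab_size, mlp_ratio=4):
    """Approximate weight count for a GPT-style decoder.

    Assumptions (not confirmed by the model card): standard attention and
    MLP blocks, tied embeddings, biases and layer norms ignored.
    """
    attention = 4 * d_model * d_model        # Q, K, V and output projections
    mlp = 2 * mlp_ratio * d_model * d_model  # up and down projections
    blocks = n_layers * (attention + mlp)
    embedding = vocab_size * d_model         # token embedding table
    return {"blocks": blocks, "embedding": embedding, "total": blocks + embedding}


estimate = estimate_gpt_params(n_layers=12, d_model=512, vocab_size=12_000)
head_dim = 512 // 8  # hidden size split evenly across 8 attention heads

print(f"transformer blocks: {estimate['blocks']:,}")
print(f"token embedding:    {estimate['embedding']:,}")
print(f"approximate total:  {estimate['total']:,}")
print(f"per-head dimension: {head_dim}")
```

Under these assumptions the estimate comes to roughly 44 million weights, most of them in the transformer blocks.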
Usage:
from transformers import AutoModel
from miditok import MusicTokenizer
import torch

# Use a GPU when one is available
device = 'cuda' if torch.cuda.is_available() else 'cpu'

# Load the MIDI tokenizer and the model (the model relies on custom code from the repository)
tokenizer = MusicTokenizer.from_pretrained('shikhr/music_maker')
model = AutoModel.from_pretrained('shikhr/music_maker', trust_remote_code=True)
model.to(device)

# Generate some music, starting from a single token with id 1
out = model.generate(
    torch.tensor([[1]]).to(device), max_new_tokens=400, temperature=1.0, top_k=None
)

# Decode the generated token ids and save the result as a MIDI file
tokenizer(out[0].tolist()).dump_midi("generated.mid")
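The `temperature` and `top_k` arguments in the call above are common sampling controls. How this model's custom `generate` implements them is not shown on this card, so the following is only a generic sketch of temperature scaling with optional top-k filtering, written with the standard library. The function `sample_next_token` and its example logits are illustrative and are not part of the model's code.

```python
import math
import random


def sample_next_token(logits, temperature=1.0, top_k=None, rng=random):
    """Sample one token id from raw logits (generic sketch, not the model's code)."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")

    # Optionally keep only the top_k highest-scoring token ids
    candidates = list(range(len(logits)))
    if top_k is not None:
        candidates = sorted(candidates, key=lambda i: logits[i], reverse=True)[:top_k]

    # Lower temperatures sharpen the distribution, higher ones flatten it
    scaled = [logits[i] / temperature for i in candidates]

    # Softmax weights, shifted by the maximum for numerical stability
    peak = max(scaled)
    weights = [math.exp(s - peak) for s in scaled]

    return rng.choices(candidates, weights=weights, k=1)[0]


example_logits = [0.1, 2.0, -1.0, 0.5]
greedy_like = sample_next_token(example_logits, temperature=1.0, top_k=1)
restricted = sample_next_token(example_logits, temperature=0.8, top_k=2)
print(greedy_like, restricted)
```

With `top_k=1` only the highest-scoring token can be chosen, while `top_k=None` (as in the usage snippet) samples from the full vocabulary.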
Limitations and Bias:
- The model has been trained only on piano MIDI data, so it may not generalize well to other instruments.
- The generated music may exhibit some repetitive or unnatural patterns.
- The training data itself may contain certain biases or patterns reflective of its sources.