MusicMaker - Transformer Model for Music Generation

Overview:

MusicMaker is a transformer-based model that generates novel musical compositions in MIDI format. Trained on a dataset of piano MIDI files, it learns the patterns and structure present in the music and uses them to produce coherent new melodies.

Key Features:

  • Generation of novel musical compositions in MIDI format
  • Trained on a dataset of piano MIDI files
  • Based on transformer architecture for capturing long-range dependencies
  • Tokenizer trained specifically on MIDI data using miditok library

Training Data:

The model was trained on a dataset of ~11,000 piano MIDI files from the "adl-piano-midi" collection.
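
The exact preprocessing pipeline is not documented on this card. As a rough sketch of how such a corpus could be turned into token sequences with miditok, assuming a REMI tokenizer, an "adl-piano-midi" directory layout, and the train() vocabulary-training entry point of recent miditok releases (all of these are illustrative assumptions, not the settings actually used for MusicMaker):

from pathlib import Path
from miditok import REMI, TokenizerConfig
from symusic import Score

# Illustrative preprocessing only: the tokenizer type (REMI), the dataset
# path, and the training call are assumptions, not the exact pipeline used
# to build MusicMaker.
midi_paths = list(Path("adl-piano-midi").glob("**/*.mid"))

tokenizer = REMI(TokenizerConfig())
# Recent miditok releases expose vocabulary training (e.g. BPE) via train();
# the 12,000-entry target matches the vocabulary size listed under Model Details.
tokenizer.train(vocab_size=12_000, files_paths=midi_paths)

# Encode each file into token ids suitable for language-model training.
sequences = []
for path in midi_paths:
    toks = tokenizer(Score(str(path)))  # REMI yields one TokSequence per track
    for seq in (toks if isinstance(toks, list) else [toks]):
        sequences.append(seq.ids)
print(f"{len(sequences)} token sequences prepared")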

Model Details:

  • Architecture: GPT-style transformer
  • Number of layers: 12
  • Hidden size: 512
  • Attention heads: 8
  • Tokenizer vocabulary size: 12,000
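
For reference, these dimensions correspond to a fairly small GPT-style model. A comparable, purely illustrative configuration can be expressed with the standard transformers GPT2Config; the actual architecture is defined by the custom code shipped in the repo and may differ in details such as context length:

from transformers import GPT2Config, GPT2LMHeadModel

# Illustrative only: mirrors the hyperparameters listed above.
# The context length (n_positions) is an assumption, not stated on this card.
config = GPT2Config(
    vocab_size=12_000,
    n_layer=12,
    n_embd=512,
    n_head=8,
    n_positions=1024,
)
model = GPT2LMHeadModel(config)
print(f"{sum(p.numel() for p in model.parameters()):,} parameters")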

Usage:

from transformers import AutoModel
from miditok import MusicTokenizer
import torch

device = 'cuda' if torch.cuda.is_available() else 'cpu'

# Load the MIDI tokenizer published alongside the model
tokenizer = MusicTokenizer.from_pretrained('shikhr/music_maker')

# The repo ships its own architecture code, so trust_remote_code is required
model = AutoModel.from_pretrained('shikhr/music_maker', trust_remote_code=True)
model.to(device)

# Generate some music, seeding generation with a single start token id
out = model.generate(
    torch.tensor([[1]]).to(device), max_new_tokens=400, temperature=1.0, top_k=None
)

# Decode the generated token ids back into a score and save it as a MIDI file
tokenizer(out[0].tolist()).dump_midi("generated.mid")
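
As an optional check, the written file can be reloaded and summarized with symusic, the MIDI backend used by recent miditok versions (this snippet is an illustration, not part of the model's own API):

from symusic import Score

# Reload the file written above and print a quick summary.
score = Score("generated.mid")
print(f"{score.note_num()} notes across {len(score.tracks)} track(s)")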

Limitations and Bias:

  • The model has only been trained on piano MIDI data, so its ability to generalize to other instruments may be limited.
  • The generated music may exhibit some repetitive or unnatural patterns.
  • The training data itself may contain certain biases or patterns reflective of its sources.