MusicMaker - Transformer Model for Music Generation

Overview:

MusicMaker is a transformer-based model that generates novel musical compositions in MIDI format. Trained on a dataset of piano MIDI files, it learns the patterns and structure present in the music and uses them to produce coherent new melodies.

Key Features:

  • Generation of novel musical compositions in MIDI format
  • Trained on a dataset of piano MIDI files
  • Based on transformer architecture for capturing long-range dependencies
  • Tokenizer trained specifically on MIDI data using miditok library

Training Data:

The model was trained on a dataset of ~11,000 piano MIDI files from the "adl-piano-midi" collection.
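
The exact preprocessing pipeline is not documented on this card. As a rough sketch of how such a corpus could be turned into token sequences with miditok, assuming a REMI tokenizer, an "adl-piano-midi" directory layout, and the train() vocabulary-training entry point of recent miditok releases (all of these are illustrative assumptions, not the settings actually used for MusicMaker):

from pathlib import Path
from miditok import REMI, TokenizerConfig
from symusic import Score

# Illustrative preprocessing only: the tokenizer type (REMI), the dataset
# path, and the training call are assumptions, not the exact pipeline used
# to build MusicMaker.
midi_paths = list(Path("adl-piano-midi").glob("**/*.mid"))

tokenizer = REMI(TokenizerConfig())
# Recent miditok releases expose vocabulary training (e.g. BPE) via train();
# the 12,000-entry target matches the vocabulary size listed under Model Details.
tokenizer.train(vocab_size=12_000, files_paths=midi_paths)

# Encode each file into token ids suitable for language-model training.
sequences = []
for path in midi_paths:
    toks = tokenizer(Score(str(path)))  # REMI yields one TokSequence per track
    for seq in (toks if isinstance(toks, list) else [toks]):
        sequences.append(seq.ids)
print(f"{len(sequences)} token sequences prepared")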

Model Details:

  • Architecture: GPT-style transformer
  • Number of layers: 12
  • Hidden size: 512
  • Attention heads: 8
  • Tokenizer vocabulary size: 12,000
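
For reference, these dimensions correspond to a fairly small GPT-style model. A comparable, purely illustrative configuration can be expressed with the standard transformers GPT2Config; the actual architecture is defined by the custom code shipped in the repo and may differ in details such as context length:

from transformers import GPT2Config, GPT2LMHeadModel

# Illustrative only: mirrors the hyperparameters listed above.
# The context length (n_positions) is an assumption, not stated on this card.
config = GPT2Config(
    vocab_size=12_000,
    n_layer=12,
    n_embd=512,
    n_head=8,
    n_positions=1024,
)
model = GPT2LMHeadModel(config)
print(f"{sum(p.numel() for p in model.parameters()):,} parameters")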

Usage:

from transformers import AutoModel
from miditok import MusicTokenizer
import torch

device = 'cuda' if torch.cuda.is_available() else 'cpu'

# Load the MIDI tokenizer published alongside the model
tokenizer = MusicTokenizer.from_pretrained('shikhr/music_maker')

# The repo ships its own architecture code, so trust_remote_code is required
model = AutoModel.from_pretrained('shikhr/music_maker', trust_remote_code=True)
model.to(device)

# Generate some music, seeding generation with a single start token id
out = model.generate(
    torch.tensor([[1]]).to(device), max_new_tokens=400, temperature=1.0, top_k=None
)

# Decode the generated token ids back into a score and save it as a MIDI file
tokenizer(out[0].tolist()).dump_midi("generated.mid")
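
As an optional check, the written file can be reloaded and summarized with symusic, the MIDI backend used by recent miditok versions (this snippet is an illustration, not part of the model's own API):

from symusic import Score

# Reload the file written above and print a quick summary.
score = Score("generated.mid")
print(f"{score.note_num()} notes across {len(score.tracks)} track(s)")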

Limitations and Bias:

  • The model has only been trained on piano MIDI data, so its ability to generalize to other instruments may be limited.
  • The generated music may exhibit some repetitive or unnatural patterns.
  • The training data itself may contain certain biases or patterns reflective of its sources.