---
license: mit
pipeline_tag: audio-to-audio
---

MusicMaker - Transformer Model for Music Generation

Overview:

MusicMaker is a transformer-based model trained to generate novel musical compositions in the MIDI format. By learning from a dataset of piano MIDI files, the model can capture the intricate patterns and structures present in music and generate coherent and creative melodies.

Key Features:

  • Generation of novel musical compositions in MIDI format
  • Trained on a dataset of piano MIDI files
  • Based on transformer architecture for capturing long-range dependencies
  • Tokenizer trained specifically on MIDI data using the miditok library (see the sketch after this list)
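
The tokenizer is distributed with the model, so you normally do not need to train it yourself. For reference, here is a minimal sketch of how a MIDI tokenizer could be trained with miditok. The actual tokenization scheme and training settings used for MusicMaker are not documented here, so the REMI class, the BPE training call, and the dataset path are assumptions (requires a recent miditok version).

from pathlib import Path
from miditok import REMI, TokenizerConfig

# Assumed setup: a REMI tokenizer with default settings, trained with BPE
# up to the vocabulary size stated under "Model Details" below.
tokenizer = REMI(TokenizerConfig())
midi_paths = list(Path('adl-piano-midi').rglob('*.mid'))    # hypothetical dataset location
tokenizer.train(vocab_size=12_000, files_paths=midi_paths)  # BPE training (miditok >= 3.0)
tokenizer.save_pretrained('music_maker_tokenizer')          # save for later reuse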

Training Data:

The model was trained on a dataset of ~11,000 piano MIDI files from the "adl-piano-midi" collection.
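
The exact preprocessing pipeline is not documented here, but a minimal sketch of turning the corpus into training sequences could look like the following. The directory layout is an assumption; recent miditok tokenizers can be called directly on MIDI files and return token sequences whose integer ids feed the transformer.

from pathlib import Path
from miditok import MusicTokenizer

tokenizer = MusicTokenizer.from_pretrained('shikhr/music_maker')
midi_paths = list(Path('adl-piano-midi').rglob('*.mid'))  # hypothetical dataset location

sequences = []
for path in midi_paths:
    try:
        tokens = tokenizer(path)   # encode one MIDI file
    except Exception:
        continue                   # skip unreadable or corrupt files
    if isinstance(tokens, list):   # per-track sequences, depending on tokenizer config
        tokens = tokens[0]
    sequences.append(tokens.ids)   # integer ids used as model inputs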

Model Details:

  • Architecture: GPT-style transformer
  • Number of layers: 12
  • Hidden size: 512
  • Attention heads: 8
  • Tokenizer vocabulary size: 12,000
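
The hyperparameters above correspond roughly to the configuration below. This is illustrative only: the published checkpoint ships its own model code (loaded with trust_remote_code=True in the Usage section), so GPT2Config is used here purely to show the scale of the model, not its actual implementation.

from transformers import GPT2Config, GPT2LMHeadModel

config = GPT2Config(
    vocab_size=12_000,  # tokenizer vocabulary size
    n_layer=12,         # transformer layers
    n_embd=512,         # hidden size
    n_head=8,           # attention heads
)
model = GPT2LMHeadModel(config)
print(f"{sum(p.numel() for p in model.parameters()) / 1e6:.1f}M parameters")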

Usage:


from transformers import AutoModel
from miditok import MusicTokenizer
import torch

device = 'cuda' if torch.cuda.is_available() else 'cpu'

# Load the MIDI tokenizer from the Hub
tokenizer = MusicTokenizer.from_pretrained('shikhr/music_maker')

# Load the model (the repository ships custom model code, hence trust_remote_code)
model = AutoModel.from_pretrained('shikhr/music_maker', trust_remote_code=True)
model.to(device)

# Generate some music, starting from a single seed token
out = model.generate(
    torch.tensor([[1]]).to(device), max_new_tokens=400, temperature=1.0, top_k=None
)

# Decode the token ids back into a score and save it as a MIDI file
tokenizer(out[0].tolist()).dump_midi("generated.mid")
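
Generation is stochastic, so each call produces a different piece; temperature and top_k trade coherence against variety. The values below are illustrative and simply reuse the calls shown above.

# Sample a few pieces at different temperatures (illustrative values)
for i, temp in enumerate([0.8, 1.0, 1.2]):
    out = model.generate(
        torch.tensor([[1]]).to(device), max_new_tokens=400, temperature=temp, top_k=None
    )
    tokenizer(out[0].tolist()).dump_midi(f"generated_{i}.mid")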

Limitations and Bias:

  • The model has only been trained on piano MIDI data, so its ability to generalize to other instruments may be limited.
  • The generated music may exhibit some repetitive or unnatural patterns.
  • The training data itself may contain certain biases or patterns reflective of its sources.