SMuPT: Symbolic Music Generative Pre-trained Transformer
SMuPT is a series of pre-trained models for symbolic music generation. It was trained on a large-scale dataset of symbolic music comprising millions of monophonic and polyphonic pieces across different genres and styles. The models use the LLaMA 2 architecture and can be applied to downstream music generation tasks such as melody generation, accompaniment generation, and multi-track music generation.
09/01/2024: A series of pre-trained SMuPT models is released, with parameter counts ranging from 110M to 1.3B.
Model architecture
The details of the SMuPT-v0 model architecture are listed below:

| Name | Parameters | Training Data (Music Pieces) | Seq Length | Hidden Size | Layers | Heads |
| --- | --- | --- | --- | --- | --- | --- |
| SMuPT-v0-8192-110M | 110M | 7M x 5.8 epochs | 8192 | 768 | 12 | 12 |
| SMuPT-v0-8192-345M | 345M | 7M x 4 epochs | 8192 | 1024 | 24 | 16 |
| SMuPT-v0-8192-770M | 770M | 7M x 3 epochs | 8192 | 1280 | 36 | 20 |
| SMuPT-v0-8192-1.3B | 1.3B | 7M x 2.2 epochs | 8192 | 1536 | 48 | 24 |
Model Usage
There are several ways to use our pre-trained SMuPT models. Below we describe usage based on Megatron-LM; the Huggingface format will be supported soon.
Before starting, make sure you have set up the relevant environment and codebase.
```shell
# Pull the Megatron-LM codebase.
mkdir -p /path/to/workspace && cd /path/to/workspace
git clone https://github.com/NVIDIA/Megatron-LM.git

# Download the pre-trained SMuPT model checkpoint and vocab files from the Huggingface page.
# The URLs are quoted so the shell does not glob on the `?` in the query string.
mkdir -p /models/SMuPT_v0_8192_1.3B && cd /models/SMuPT_v0_8192_1.3B
wget -O model_optim_rng.pt "https://huggingface.co/m-a-p/SMuPT_v0_8192_1.3B/resolve/main/model_optim_rng.pt?download=true"
wget -O newline.vocab "https://huggingface.co/m-a-p/SMuPT_v0_8192_1.3B/resolve/main/newline.vocab?download=true"
wget -O newline.txt "https://huggingface.co/m-a-p/SMuPT_v0_8192_1.3B/resolve/main/newline.txt?download=true"

# Pull the latest NGC PyTorch container, mount the workspace directory, and enter the container.
docker run --gpus all -it --name megatron --shm-size=16g -v $PWD:/workspace -p 5000:5000 nvcr.io/nvidia/pytorch:23.11-py3 /bin/bash
```
Once inside the container, you can start a REST server for inference.
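Megatron-LM ships a text-generation REST server at `tools/run_text_generation_server.py`. The launch sketch below is an assumption-laden starting point, not the project's verified command: the model-shape flags follow the SMuPT-v0-8192-1.3B row of the table above, while the tokenizer flags for the SMuPT vocab files are omitted and must be filled in for your setup.

```shell
# Inside the container, from the Megatron-LM checkout.
cd /workspace/Megatron-LM

# Sketch: launch the text-generation REST server on port 5000.
# Model-shape flags follow the SMuPT-v0-8192-1.3B configuration;
# add the tokenizer/vocab flags appropriate for newline.vocab / newline.txt.
torchrun --nproc_per_node 1 tools/run_text_generation_server.py \
    --tensor-model-parallel-size 1 \
    --pipeline-model-parallel-size 1 \
    --num-layers 48 \
    --hidden-size 1536 \
    --num-attention-heads 24 \
    --seq-length 8192 \
    --max-position-embeddings 8192 \
    --micro-batch-size 1 \
    --fp16 \
    --load /models/SMuPT_v0_8192_1.3B \
    --port 5000
```

The server listens on port 5000, which is the port mapped out of the container by the `docker run` command above.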
You can then use `curl` to query the server directly. Note that the newline token `\n` is represented by `<n>` in the vocabulary, so you need to replace newlines with `<n>` in the prompt and map `<n>` back to newlines in the generated tokens.
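The `<n>` substitution can be wrapped in small helper functions. The sketch below is a minimal illustration in Python rather than `curl`; `encode_prompt`, `decode_output`, and `query_server` are hypothetical helper names, and the `/api` endpoint with a PUT request carrying `prompts` and `tokens_to_generate` is assumed to match Megatron-LM's text-generation server.

```python
import json
import urllib.request


def encode_prompt(prompt: str) -> str:
    """Replace literal newlines with the <n> token used by the vocabulary."""
    return prompt.replace("\n", "<n>")


def decode_output(text: str) -> str:
    """Map generated <n> tokens back to literal newlines."""
    return text.replace("<n>", "\n")


def query_server(prompt: str, tokens_to_generate: int = 256,
                 url: str = "http://localhost:5000/api") -> str:
    """Send a generation request to the REST server (assumed request shape)."""
    payload = json.dumps({
        "prompts": [encode_prompt(prompt)],
        "tokens_to_generate": tokens_to_generate,
    }).encode("utf-8")
    req = urllib.request.Request(
        url, data=payload, method="PUT",
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        result = json.loads(resp.read())
    # The server is assumed to return generated text under the "text" key.
    return decode_output(result["text"][0])


if __name__ == "__main__":
    # Round-trip the newline substitution on a small ABC-notation prompt.
    print(encode_prompt("X:1\nL:1/8\n"))
```

The same substitution applies if you query with `curl` directly: write `<n>` in place of every newline in the JSON prompt, and translate `<n>` back to newlines in the response.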