tags:
- gpt2
- text-generation
- music-modeling
- music-generation
widget:
- text: PIECE_START
- text: >-
    PIECE_START STYLE=JSFAKES GENRE=JSFAKES TRACK_START INST=48 BAR_START
    NOTE_ON=60
- text: >-
    PIECE_START STYLE=JSFAKES GENRE=JSFAKES TRACK_START INST=48 BAR_START
    NOTE_ON=58
GPT-2 for Music
Language models such as GPT-2 can be used for music generation. The idea is to represent pieces of music as text, which reduces the task to language generation.
This model is a rather small instance of GPT-2 trained on TristanBehrens/js-fakes-4bars. It generates Bach-like chorales four bars at a time, with four voices (soprano, alto, tenor, bass).
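To make the "music as text" idea concrete, here is a minimal sketch that splits one of the example prompts from this card into (name, value) pairs. The helper `parse_tokens` is hypothetical and not part of the model or of any library. Only tokens that appear in the example prompts are used. Reading `NOTE_ON=60` as a MIDI pitch and `INST=48` as an instrument number is an assumption about the encoding, not something this card states.

```python
def parse_tokens(text):
    """Split a whitespace-separated token string into (name, value) pairs.

    Hypothetical helper for illustration. Bare tokens such as PIECE_START
    get the value None; numeric values are converted to int.
    """
    events = []
    for token in text.split():
        name, sep, value = token.partition("=")
        if not sep:
            events.append((name, None))        # structural marker, no value
        elif value.isdigit():
            events.append((name, int(value)))  # e.g. INST=48, NOTE_ON=60
        else:
            events.append((name, value))       # e.g. STYLE=JSFAKES
    return events


# One of the example prompts from this model card.
prompt = (
    "PIECE_START STYLE=JSFAKES GENRE=JSFAKES "
    "TRACK_START INST=48 BAR_START NOTE_ON=60"
)

events = parse_tokens(prompt)
for event in events:
    print(event)
```

The same function can be applied to generated text as a first step towards turning it back into notes, although the full token vocabulary is not documented here and would have to be checked against the tokenizer.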
If you want to contribute, say hello, or learn more, find me on LinkedIn.
Model description
The model is a GPT-2 with 6 decoder blocks and 8 attention heads per block. The context length is 512 tokens, the embedding dimension is also 512, and the vocabulary size is 119.
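To get a feeling for how small this model is, the numbers above can be turned into a rough parameter count. This is only a sketch: it assumes the standard GPT-2 block layout, an MLP hidden size of four times the embedding dimension, and an output layer that shares its weights with the token embedding. None of these are stated in this card, and the function `gpt2_param_count` is hypothetical.

```python
def gpt2_param_count(vocab_size, n_positions, n_embd, n_layer, mlp_ratio=4):
    """Estimate trainable parameters of a GPT-2 style decoder.

    Assumes the standard GPT-2 layout with biases, a tied output layer,
    and an MLP hidden size of mlp_ratio * n_embd (all assumptions).
    """
    d = n_embd
    # Token embeddings plus learned position embeddings.
    embeddings = vocab_size * d + n_positions * d
    # Fused query/key/value projection plus the attention output projection.
    attention = (d * 3 * d + 3 * d) + (d * d + d)
    # Two-layer feed-forward network.
    hidden = mlp_ratio * d
    mlp = (d * hidden + hidden) + (hidden * d + d)
    # Two layer norms per block, each with a scale and a bias vector.
    layer_norms = 2 * (2 * d)
    per_block = attention + mlp + layer_norms
    final_layer_norm = 2 * d
    return embeddings + n_layer * per_block + final_layer_norm


total = gpt2_param_count(vocab_size=119, n_positions=512, n_embd=512, n_layer=6)
head_dim = 512 // 8  # embedding dimension divided by the number of heads
print(f"{total:,} parameters, {head_dim} dimensions per attention head")
```

Under these assumptions the estimate comes to about 19.2 million parameters. The number of attention heads does not change the count; it only determines how the 512 dimensions are split across heads.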
Intended uses & limitations
This model is just a proof of concept. It shows that the Hugging Face tooling can be used to compose music.
How to use
You can start generating music immediately by running these lines of code:
from transformers import AutoTokenizer, AutoModelForCausalLM
tokenizer = AutoTokenizer.from_pretrained("TristanBehrens/js-fakes-4bars")
model = AutoModelForCausalLM.from_pretrained("TristanBehrens/js-fakes-4bars")
input_ids = tokenizer.encode("PIECE_START", return_tensors="pt")
print(input_ids)
generated_ids = model.generate(input_ids, max_length=500)
generated_sequence = tokenizer.decode(generated_ids[0])
print(generated_sequence)
Note that this only generates music as text. To actually listen to the generated music, you can use this notebook.
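The prompt does not have to be the bare `PIECE_START`. The example prompts at the top of this card also fix the style, the first instrument, and the first note. Here is a minimal sketch of a helper that builds such prompts. The function `build_prompt` and its parameters are hypothetical; it only reproduces the token order seen in those examples, and whether other values for the instrument or the note are in the vocabulary is an assumption you would need to check against the tokenizer.

```python
def build_prompt(instrument=None, first_note=None, style="JSFAKES"):
    """Build a generation prompt in the format of this card's examples.

    Hypothetical helper. With no arguments it returns the bare PIECE_START
    prompt; otherwise it opens the first track and the first bar.
    """
    if instrument is None:
        if first_note is not None:
            raise ValueError("first_note requires an instrument")
        return "PIECE_START"

    tokens = [
        "PIECE_START",
        f"STYLE={style}",
        f"GENRE={style}",  # the examples use the same value for both
        "TRACK_START",
        f"INST={instrument}",
        "BAR_START",
    ]
    if first_note is not None:
        tokens.append(f"NOTE_ON={first_note}")
    return " ".join(tokens)


print(build_prompt())
print(build_prompt(instrument=48, first_note=60))
```

The resulting string can be passed to `tokenizer.encode(..., return_tensors="pt")` in place of `"PIECE_START"` in the snippet above.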
Limitations and bias
Since this model was trained on a very small corpus of music, it overfits heavily.
Training data
The model has been trained on Omar Peracha's JS Fake Chorales dataset, which is a fine collection of 500 Bach-like chorales.