---
license: mit
language: en
widget:
- text: "This is a traditional Irish dance music."
inference:
  parameters:
    top_p: 0.9
    max_length: 1024
    do_sample: True
---
# Exploring the Efficacy of Pre-trained Checkpoints in Text-to-Music Generation Task
## Model description
This language-music model is [BART-base](https://huggingface.co/facebook/bart-base) fine-tuned on 282,870 English text-music pairs, where all scores are represented in ABC notation. It was introduced in the paper [Exploring the Efficacy of Pre-trained Checkpoints in Text-to-Music Generation Task](https://arxiv.org/abs/2211.11216) by Wu et al. and released in [this repository](https://github.com/sander-wood/text-to-music).
It can generate complete and semantically consistent sheet music directly from natural-language descriptions. To the best of our knowledge, this is the first model to achieve text-conditional symbolic music generation that is trained on real text-music pairs, with the music generated entirely by the model and without any hand-crafted rules.
## Intended uses & limitations
You can use this model for text-conditional music generation. All scores generated by this model can be written on one stave (for vocal solo or instrumental solo) in standard classical notation, and are in a variety of styles, e.g., blues, classical, folk, jazz, pop, and world music. We recommend using the script in [this repository](https://github.com/sander-wood/text-to-music) for inference. The generated tunes are in ABC notation, and can be converted to sheet music or audio using [this website](https://ldzhangyx.github.io/abc/), or [this software](https://sourceforge.net/projects/easyabc/).
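Before passing generated ABC text to a converter, it can help to split it into individual tunes and inspect their header fields. The following is a minimal sketch, assuming the text starts with an `X:` line and that each tune begins with its own `X:` line, as in the examples in this card. The helper names `split_tunes` and `header_fields` are hypothetical and are not part of the repository.

```python
def split_tunes(abc_text):
    """Split ABC text into one string per tune; a new tune starts at each 'X:' line."""
    tunes, current = [], []
    for line in abc_text.splitlines():
        # An 'X:' line opens a new tune, so flush the one collected so far.
        if line.startswith("X:") and current:
            tunes.append("\n".join(current))
            current = []
        if line.strip():
            current.append(line)
    if current:
        tunes.append("\n".join(current))
    return tunes


def header_fields(tune):
    """Return the header fields (e.g. X, L, M, K) found before the music body."""
    fields = {}
    for line in tune.splitlines():
        # Header lines look like 'M:6/8': one letter, a colon, then the value.
        if len(line) > 1 and line[0].isalpha() and line[1] == ":":
            fields[line[0]] = line[2:].strip()
            if line[0] == "K":  # In ABC, the K: field closes the header.
                break
    return fields
```

For instance, `header_fields(split_tunes(text)[0])` returns a dictionary with the note length (`L`), meter (`M`), and key (`K`) of the first tune.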
Its creativity is limited: it does not perform well on tasks requiring a high degree of creativity (e.g., melody style transfer), and it is sensitive to its input. For more information, please check [our paper](https://arxiv.org/abs/2211.11216).
### How to use
Here is how to use this model in PyTorch:
```python
import torch
from samplings import top_p_sampling, temperature_sampling
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained('sander-wood/text-to-music')
model = AutoModelForSeq2SeqLM.from_pretrained('sander-wood/text-to-music')

max_length = 1024
top_p = 0.9
temperature = 1.0

text = "This is a traditional Irish dance music."
input_ids = tokenizer(text,
                      return_tensors='pt',
                      truncation=True,
                      max_length=max_length)['input_ids']

decoder_start_token_id = model.config.decoder_start_token_id
eos_token_id = model.config.eos_token_id

# Start decoding from the decoder start token.
decoder_input_ids = torch.tensor([[decoder_start_token_id]])

# Sample one token per step until the end-of-sequence token appears.
for t_idx in range(max_length):
    outputs = model(input_ids=input_ids,
                    decoder_input_ids=decoder_input_ids)
    # Logits of the last position, converted to a probability distribution.
    probs = outputs.logits[0][-1]
    probs = torch.nn.Softmax(dim=-1)(probs).detach().numpy()
    sampled_id = temperature_sampling(probs=top_p_sampling(probs,
                                                           top_p=top_p,
                                                           return_probs=True),
                                      temperature=temperature)
    decoder_input_ids = torch.cat((decoder_input_ids, torch.tensor([[sampled_id]])), 1)
    if sampled_id == eos_token_id:
        # Prepend the ABC reference number field and print the finished tune.
        tune = "X:1\n"
        tune += tokenizer.decode(decoder_input_ids[0], skip_special_tokens=True)
        print(tune)
        break
```
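The loop above relies on `top_p_sampling` and `temperature_sampling` from the `samplings` package. As a rough illustration of what such a step typically does, here is a minimal pure-Python sketch of nucleus (top-p) filtering followed by temperature rescaling, in the same order as the loop. The function names `top_p_filter`, `apply_temperature`, and `sample_index` are hypothetical, and this is not the `samplings` implementation.

```python
import random


def top_p_filter(probs, top_p=0.9):
    """Keep the smallest set of most likely tokens whose mass reaches top_p, then renormalize."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in order:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:
            break
    filtered = [0.0] * len(probs)
    for i in kept:
        filtered[i] = probs[i] / mass
    return filtered


def apply_temperature(probs, temperature=1.0):
    """Rescale probabilities as p ** (1 / T); T < 1 sharpens, T > 1 flattens."""
    scaled = [p ** (1.0 / temperature) if p > 0 else 0.0 for p in probs]
    total = sum(scaled)
    return [s / total for s in scaled]


def sample_index(probs):
    """Draw one token index according to the given probabilities."""
    return random.choices(range(len(probs)), weights=probs, k=1)[0]
```

With `top_p = 0.9`, low-probability tokens in the tail are never sampled, which tends to reduce malformed ABC output at the cost of some diversity.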
### Generation Examples
Here are some examples generated by this model without cherry-picking.
```
######################## INPUT TEXT ########################
This is a traditional Irish dance music.
Note Length-1/8
Meter-6/8
Key-D
####################### OUTPUT TUNES #######################
X:1
L:1/8
M:6/8
K:D
A | BEE BEE | Bdf edB | BAF FEF | DFA BAF | BEE BEE | Bdf edB | BAF DAF | FED E2 :: A |
Bef gfe | faf edB | BAF FEF | DFA BAF | Bef gfe | faf edB | BAF DAF | FED E2 :|
X:2
L:1/8
M:6/8
K:D
A |: DED F2 A | d2 f ecA | G2 B F2 A | E2 F GFE | DED F2 A | d2 f ecA | Bgf edc |1 d3 d2 A :|2
d3 d2 a || a2 f d2 e | f2 g agf | g2 e c2 d | e2 f gfe | fed gfe | agf bag | fed cde | d3 d2 a |
agf fed | Adf agf | gfe ecA | Ace gfe | fed gfe | agf bag | fed cde | d3 d2 ||
X:3
L:1/8
M:6/8
K:D
BEE BEE | Bdf edB | BAF FEF | DFA dBA | BEE BEE | Bdf edB | BAF FEF |1 DED DFA :|2 DED D2 e |:
faf edB | BAF DFA | BAF FEF | DFA dBA | faf edB | BAF DFA | BdB AFA |1 DED D2 e :|2 DED DFA ||
```
```
######################## INPUT TEXT ########################
This is a jazz-swing lead sheet with chord and vocal.
####################### OUTPUT TUNES #######################
X:1
L:1/8
M:4/4
K:F
"F" CFG |"F" A6 z G |"Fm7" A3 G"Bb7" A3 G |"F" A6 z G |"F7" A4"Eb7" G4 |"F" F6 z F |
"Dm" A3 G"Dm/C" A3 G |"Bb" A2"Gm" B2"C7" G3 G |"F" F8- |"Dm7""G7" F6 z2 |"C" C4 C3 C |
"C7" C2 B,2"F" C4 |"F" C4 C3 C |"Dm" D2 C2"Dm/C" D4 |"Bb" D4 D3 D |"Bb" D2 C2"C7" D4 |"F" C8- |
"F" C4"Gm" z C"C7" FG |"F" A6 z G |"Fm7" A3 G"Bb7" A3 G |"F" A6 z G |"F7" A4"Eb7" G4 |"F" F6 z F |
"Dm" A3 G"Dm/C" A3 G |"Bb" A2"Gm" B2"C7" G3 G |"F" F8- |"F" F6 z2 |]
X:2
L:1/4
M:4/4
K:F
"^A""F" A3 A |"Am7" A2"D7" A2 |"Gm7" G2"C7" G A |"F" F4 |"F" A3 A |"Am7" A2"D7" A2 |"Gm7" G2"C7" G A |
"F" F4 |"Gm" B3 B |"Am7" B2"D7" B2 |"Gm" B2"D7" B A |"Gm7" G4 |"F" A3 A |"Am7" A2"D7" A2 |
"Gm7" G2"C7" G A |"F" F4 |"Bb7" F3 G |"F" A2 A2 |"Gm" B2"C7" B2 |"F" c2"D7" c c |"Gm7" c2"C7" B2 |
"F" A2"F7" A2 |"Bb" B2"F" B A |"Bb" B2"F" B A |"Gm" B2"F" B A |"Gm7" B2"F" B A |"Gm7" B2"F" B A |
"C7" B2 c2 |"F""Bb7" A4 |"F""Bb7" z4 |]
X:3
L:1/4
M:4/4
K:Bb
B, ||"Gm""^A1" G,2 B, D |"D7" ^F A2 G/=F/ |"Gm" G2"Cm7" B c |"F7" A2 G =F |"Bb" D2 F A |
"Cm7" c e2 d/c/ |"Gm7" B3/2 G/-"C7" G2- |"F7" G2 z B, |"Gm""^B" G,2 B, D |"D7" ^F A2 G/=F/ |
"Gm" G2"Cm7" B c |"F7" A2 G =F |"Bb" D2 F A |"Cm7" c e2 d/c/ |"Gm7" B3/2 G/-"C7" G2- |"F7" G2 z2 ||
"^C""F7""^A2" F4- | F E D C |"Bb" D2 F B | d3 c/B/ |"F" A2"Cm7" G2 |"D7" ^F2 G2 |"Gm" B3"C7" A |
"F7" G4 ||"F7""^A3" F4- | F E D C |"Bb" D2 F B | d3 c/B/ |"F" A2"Cm7" G2 |"D7" ^F2 G2 |"Gm" B3 A |
"C7" G4 ||"^B""Gm""^C" B2 c B |"Cm" c B c B |"Gm7" c2 B A |"C7" B3 A |"Bb" B2 c B |"G7" d c B A |
"Cm" G2 A G |"F7" F2 z G ||"^C""F7" F F3 |"Bb" D D3 |"Cm" E E3 |"D7" ^F F3 |"Gm" G2 A B |"C7" d3 d |
"Gm" d3 d |"D7" d3 B, ||"^D""Gm" G,2 B, D |"D7" ^F A2 G/=F/ |"Gm" G2"Cm7" B c |"F7" A2 G =F |
"Bb" D2 F A |"Cm7" c e2 d/c/ |"Gm7" B3/2 G/-"C7" G2- |"F7" G2 z2 |]
```
```
######################## INPUT TEXT ########################
This is a Chinese folk song from the Jiangnan region. It was created during the Qianlong era (1735-1796) of the Qing dynasty. Over time, many regional variations were created, and the song gained popularity both in China and abroad. One version of the song describes a custom of giving jasmine flowers, popular in the southern Yangtze delta region of China.
####################### OUTPUT TUNES #######################
X:1
L:1/8
Q:1/4=100
M:2/4
K:C
"^Slow" DA A2 | GA c2- | c2 G2 | c2 GF | GA/G/ F2 | E2 DC | DA A2 | GA c2- | c2 GA | cd- d2 |
cA c2- | c2 GA | cd- d2 | cA c2- | c2 GA | c2 A2 | c2 d2 | cA c2- | c2 c2 | A2 G2 | F2 AG | F2 ED |
CA,/C/ D2- | D2 CD | F2 A2 | G2 ED | CG A2 | G2 FD | CA,/C/ D2- | D2 CD | F2 A2 | G2 ED |
CG A2 | G2 FD | CA,/C/ D2- | D2 z2 :|
X:2
L:1/8
Q:1/4=100
M:2/4
K:C
"^ MDolce" Ac de | d2 AG | cA cd | A2 AG | E2 ED | CD E2- | E2 z2 | EG ed | c2 AG | cA cd |
A2 AG | E2 ED | CD E2- | E2 z2 |"^ howeveroda" Ac de | d2 AG | cA cd | A2 AG | E2 ED | CD E2- |
E2 z2 | A2 cA | GA E2- | E2 z2 | GA cd | e2 ed | cd e2- | e2 z2 | ge d2 | cd c2- | c2 z2 |
Ac de | d2 AG | cA cd | A2 AG | E2 ED | CD E2- | E2 z2 | EG ed | c2 AG | cA cd | A2 AG | E2 ED |
CD E2- | E2 z2 |"^DDtisata" Ac de | d2 AG | cA cd | A2 AG | E2 ED | CD E2- | E2 z2 | A2 cA |
GA E2- | E2 z2 | GA cd | e2 ed | cd e2- | e2 z2 | ge d2 | cd c2- | c2 z2 | Ac de | d2 AG |
cA cd | A2 AG | E2 ED | CD E2- | E2 z2 | Ac de | d2 AG | cA cd | A2 AG | E2 ED | CD E2- | E2 z2 |
Ac de | d2 AG | cA cd | A2 AG | E2 ED | CD E2- | E2 z2 |"^ Easy" Ac de | d2 AG | cA cd |
A2 AG | E2 ED | CD E2- | E2 z2 | Ac de | d2 AG | cA cd | A2 AG | E2 ED | CD E2- | E2 z2 |]
X:3
L:1/8
Q:1/4=60
M:4/4
K:C
"^S books defe.." AA A2 cdcc | AcAG A4- | A8 | A,4 CD C2 | A,4 cdcA | A2 GA- A4- | A2 GA A2 AA |
AG E2 D2 C2 | D6 ED | C2 D4 C2 | D2 C2 D4 | C2 A,2 CD C2 | A,4 cdcA | A2 GA- A4- | A2 GA A2 AA |
AG E2 D2 C2 | D6 z2 |]
```
### BibTeX entry and citation info
```bibtex
@inproceedings{
wu2023exploring,
title={Exploring the Efficacy of Pre-trained Checkpoints in Text-to-Music Generation Task},
author={Shangda Wu and Maosong Sun},
booktitle={The AAAI-23 Workshop on Creative AI Across Modalities},
year={2023},
url={https://openreview.net/forum?id=QmWXskBhesn}
}
```