English
music
music-captioning
seungheondoh commited on
Commit
a96e517
1 Parent(s): 31b2e83

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +31 -1
README.md CHANGED
@@ -14,4 +14,34 @@ tags:
14
  ---
15
 
16
  - **Repository:** [LP-MusicCaps repository](https://github.com/seungheondoh/lp-music-caps)
17
- - **Paper:** [ArXiv (Update Soon)](https://arxiv.org/abs/2307.16372)
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
14
  ---
15
 
16
  - **Repository:** [LP-MusicCaps repository](https://github.com/seungheondoh/lp-music-caps)
17
+ - **Paper:** [ArXiv](https://arxiv.org/abs/2307.16372)
18
+
19
+ # :sound: LP-MusicCaps: LLM-Based Pseudo Music Captioning
20
+
21
+ [![Demo Video](https://i.imgur.com/cgi8NsD.jpg)](https://youtu.be/ezwYVaiC-AM)
22
+
23
+ This is a implementation of [LP-MusicCaps: LLM-Based Pseudo Music Captioning](#). This project aims to generate captions for music. 1) Tag-to-Caption: Using existing tags, We leverage the power of OpenAI's GPT-3.5 Turbo API to generate high-quality and contextually relevant captions based on music tag. 2) Audio-to-Caption: Using music-audio and pseudo caption pairs, we train a cross-model encoder-decoder model for end-to-end music captioning
24
+
25
+ > [**LP-MusicCaps: LLM-Based Pseudo Music Captioning**](#)
26
+ > SeungHeon Doh, Keunwoo Choi, Jongpil Lee, Juhan Nam
27
+ > To appear ISMIR 2023
28
+
29
+
30
+ ## TL;DR
31
+
32
+
33
+ <p align = "center">
34
+ <img src = "https://i.imgur.com/2LC0nT1.png">
35
+ </p>
36
+
37
+ - **[1.Tag-to-Caption: LLM Captioning](https://github.com/seungheondoh/lp-music-caps/tree/main/lpmc/llm_captioning)**: Generate caption from given tag input.
38
+ - **[2.Pretrain Music Captioning Model](https://github.com/seungheondoh/lp-music-caps/tree/main/lpmc/music_captioning)**: Generate pseudo caption from given audio.
39
+ - **[3.Transfer Music Captioning Model](https://github.com/seungheondoh/lp-music-caps/tree/main/lpmc/music_captioning/transfer.py)**: Generate human level caption from given audio.
40
+
41
+ ## Open Source Material
42
+
43
+ - [pre-trained models](https://huggingface.co/seungheondoh/lp-music-caps)
44
+ - [music-pseudo caption dataset](https://huggingface.co/datasets/seungheondoh/LP-MusicCaps-MSD)
45
+ - [demo](https://huggingface.co/spaces/seungheondoh/LP-Music-Caps-demo)
46
+
47
+ are available online for future research. example of dataset in [notebook](https://github.com/seungheondoh/lp-music-caps/blob/main/notebook/Dataset.ipynb)