---
base_model:
- laion/larger_clap_music_and_speech
tags:
- CLAP
---
Fine-tuning test of a Hugging Face model. Ver0.2

The dataset is the entire Voice Actor Statistical Corpus from the Japan Voice Actor Statistical Association (https://voice-statistics.github.io/).

When training CLAP, the caption attached to each audio clip was fixed to the template "Japanese female actor's (emotion) voice".
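
As a rough sketch of this caption scheme (assuming the corpus's `{speaker}_{emotion}_{number}.wav` file naming, e.g. `fujitou_happy_001.wav`; the actual preprocessing may have differed), the audio/caption pairs could be built like this:

```python
# Rough sketch only: builds (wav path, fixed-template caption) pairs for
# CLAP fine-tuning. Assumes the voice-statistics corpus file naming
# "{speaker}_{emotion}_{number}.wav", e.g. "fujitou_happy_001.wav".
from pathlib import Path

EMOTIONS = ("normal", "happy", "angry")  # the corpus's three styles

def make_caption(emotion: str) -> str:
    # Every clip gets the same fixed template; only the emotion word varies.
    return f"Japanese female actor's {emotion} voice"

def build_pairs(corpus_dir: str) -> list[tuple[str, str]]:
    pairs = []
    for wav in sorted(Path(corpus_dir).rglob("*.wav")):
        parts = wav.stem.split("_")  # speaker, emotion, number
        if len(parts) == 3 and parts[1] in EMOTIONS:
            pairs.append((str(wav), make_caption(parts[1])))
    return pairs
```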
When classification was then run with the fine-tuned model, the results differed between the label sets ["happy", "angry", "normal"] and ["happy voice", "angry voice", "normal voice"]. The cause is still a mystery.
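
Below is a minimal sketch of that comparison, using ClapModel and ClapProcessor from transformers; the checkpoint id and file name are placeholders, so substitute the fine-tuned checkpoint from this repo. Note that every training caption ends in the word "voice", so the second label set is textually closer to the training captions, though whether that explains the gap is unclear:

```python
import librosa
import torch
from transformers import ClapModel, ClapProcessor

# Placeholder checkpoint: swap in the fine-tuned model from this repo.
ckpt = "laion/larger_clap_music_and_speech"
model = ClapModel.from_pretrained(ckpt)
processor = ClapProcessor.from_pretrained(ckpt)

# CLAP expects 48 kHz audio; the file name is just an example clip.
audio, _ = librosa.load("fujitou_happy_001.wav", sr=48000)

# Score the same clip against both phrasings of the labels.
for labels in (["happy", "angry", "normal"],
               ["happy voice", "angry voice", "normal voice"]):
    inputs = processor(text=labels, audios=[audio],
                       sampling_rate=48000, return_tensors="pt", padding=True)
    with torch.no_grad():
        logits = model(**inputs).logits_per_audio  # shape: (1, 3)
    print(labels, logits.softmax(dim=-1).tolist())
```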
Also, Ver0.1, which was uploaded the other day, will be deleted before long.

If there are any experts using Hugging Face or CLAP models, I would love to hear your advice.