Update README.md
This page confirms the effectiveness of multilingual imatrix.

## Terminology

**wiki.test.raw score**: Perplexity score measured using [wiki.test.raw](https://huggingface.co/datasets/Salesforce/wikitext/viewer/wikitext-103-raw-v1/test), published by Salesforce, and the [llama-perplexity](https://github.com/ggerganov/llama.cpp/blob/master/examples/perplexity/README.md) command with the -c 512 setting.
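
For reference, this measurement can be reproduced with an invocation along these lines; a minimal sketch, where the model file name is a placeholder for whichever GGUF variant you are scoring:

```
# Score a GGUF model against wiki.test.raw at a 512-token context,
# matching the -c 512 setting described above.
./llama-perplexity -m gemma-2-9b-it-Q4_K_L.gguf -f wiki.test.raw -c 512
```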

The example below follows a technique bartowski often uses for his own models, quantizing the fp16 output and token-embedding tensors to q8_0 (the --output-tensor-type and --token-embedding-type flags).

Example:
```
llama-quantize --allow-requantize --output-tensor-type q8_0 --token-embedding-type q8_0 --imatrix imatrix.dat gemma-2-9B-it-BF16.gguf gemma-2-9b-it-Q4_K_L.gguf Q4_K_M
```
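
Note that the imatrix.dat file consumed above has to be computed first. A minimal sketch using llama.cpp's llama-imatrix tool, assuming the calibration text lives in a hypothetical calibration.txt:

```
# Compute an importance matrix over the calibration text; -o sets the output file.
# calibration.txt is a placeholder for your (multilingual) calibration data.
./llama-imatrix -m gemma-2-9B-it-BF16.gguf -f calibration.txt -o imatrix.dat
```
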
### Notes
- These results may vary depending on the model. It is best not to assume that these results apply to all models.
- Even under almost identical conditions, scores may increase or decrease slightly. It is better to focus on trends rather than small differences.
- Keep in mind that the imatrix-jpn-test model uses five times as much text for its imatrix as the bartowski model. The slightly better scores may simply be due to the larger amount of text.
- Ideally, performance should be measured on real tasks rather than with perplexity. However, real-task benchmarks are so varied that we leave that verification to you.
### Considerations
- It seems that imatrix is effective in all cases.
- If you want to improve performance in languages other than English even a little, adding other languages to the imatrix text seems worthwhile. However, English performance may decrease.
- If you are only using English, the quantization variations may not make much difference; a quick way to check this is sketched below.
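
One way to check this for your own setup is to score each candidate quant against wiki.test.raw and compare the reported perplexities; a minimal sketch with placeholder file names:

```
# Compare the final PPL estimate printed for each variant.
for m in gemma-2-9b-it-Q4_K_M.gguf gemma-2-9b-it-Q4_K_L.gguf; do
  echo "== $m =="
  ./llama-perplexity -m "$m" -f wiki.test.raw -c 512
done
```
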
### Other references
The following information may be helpful in your further exploration.
- [About imatrix overfitting, and importance of input text](https://github.com/ggerganov/llama.cpp/discussions/5263)
- [Importance matrix calculations work best on near-random data](https://github.com/ggerganov/llama.cpp/discussions/5006)
- [llama.cpp:iMatrix量子化は日本語性能にどう影響するか?](https://sc-bakushu.hatenablog.com/entry/2024/04/20/050213)
- [Command R+はどこまで量子化するとアホになってしまうのか?](https://soysoftware.sakura.ne.jp/archives/3834)
- [GGUFって結局どのサイズ選んだらいいの??](https://zenn.dev/yuki127/articles/e3337c176d27f2)
### Acknowledgements
Thanks to the llama.cpp community.
Thanks to u/noneabove1182 for the advice and motivation.
I do not know all the inventors of each method, so please point out any that I have missed.

- **Language(s) (NLP):** [English, Japanese]
- **Finetuned from model [optional]:** [gemma-2-9b-it]
## Citation [optional]
<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->