dahara1 committed
Commit a47a83c · verified · 1 Parent(s): ac7c072

Update README.md

Files changed (1): README.md (+10 -6)
README.md CHANGED
@@ -29,7 +29,7 @@ This page confirms the effectiveness of multilingual imatrix.

- #### Terminology
+ ## 用語集 Terminology

wiki.test.raw score
Perplexity Score measured using [wiki.test.raw](https://huggingface.co/datasets/Salesforce/wikitext/viewer/wikitext-103-raw-v1/test) published by Salesforce and the [llama-perplexity](https://github.com/ggerganov/llama.cpp/blob/master/examples/perplexity/README.md) command with -c 512 setting.
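For reference, a score like the one described above can be reproduced with llama.cpp's llama-perplexity tool. A minimal sketch, assuming hypothetical file names for the quantized GGUF being evaluated and the extracted wiki.test.raw in the working directory:

```
# Hypothetical file names; substitute the GGUF you are evaluating.
# -c 512 matches the context-size setting used for the scores on this page.
llama-perplexity -m gemma-2-9b-it-Q4_K_M.gguf -f wiki.test.raw -c 512
```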
@@ -69,7 +69,7 @@ bartowskiが自モデルに良く使用している手法で、fp16をq8_0にし
Example:
```llama-quantize --allow-requantize --output-tensor-type q8_0 --token-embedding-type q8_0 --imatrix imatrix.dat gemma-2-9B-it-BF16.gguf gemma-2-9b-it-Q4_K_L.gguf Q4_k_m```

- ### Notes
+ ### 注意事項 Notes

- These results may vary depending on the model. It is best not to assume that these results apply to all models.
- Even under almost identical conditions, scores may increase or decrease slightly. It is better to focus on trends rather than small differences.
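The imatrix.dat file passed to llama-quantize in the example above is generated beforehand with llama.cpp's llama-imatrix tool. A minimal sketch, where the calibration text file name is a hypothetical placeholder for the multilingual text mix this page experiments with:

```
# Hypothetical calibration file; the choice and amount of text is what this page is testing.
# The resulting importance matrix (imatrix.dat) is then passed to llama-quantize via --imatrix.
llama-imatrix -m gemma-2-9B-it-BF16.gguf -f calibration.txt -o imatrix.dat
```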
@@ -81,7 +81,8 @@ Example:
- imatrix-jpn-testモデルはbartowskiモデルに比べてimatrixに5倍のテキストを使用している事に留意してください。単純にテキストが多いため性能が微妙に増えている可能性があります
- 本来はperplexityではなく実タスクで性能を測定する事が望ましいです。しかし、実タスクのベンチマークも多様なのでその検証は皆さんにお任せします

- ### Considerations
+ ### 考察 Considerations
+
- It seems that imatrix is effective in all cases.
- If you want to improve the performance of languages other than English even a little, it seems worth adding other languages. However, there is a possibility that your English ability may decrease.
- If you are only using English, the quantization variations may not make much difference.
@@ -89,15 +90,18 @@ Example:
- 英語以外の言語の性能を少しでも向上させたい場合は他言語を追加する価値はありそうです。しかし、英語能力が下がる可能性があります。
- 英語だけを使っている場合、量子化のバリエーションは大きな違いがない可能性があります

- ### Other references
+ ### その他参考情報 Other references
+
The following information may be helpful in your further exploration.
以下の情報は更なる探求を行う際に参考になるかもしれません。
- [About imatrix overfitting, and importance of input text](https://github.com/ggerganov/llama.cpp/discussions/5263)
- [Importance matrix calculations work best on near-random data](https://github.com/ggerganov/llama.cpp/discussions/5006)
- [llama.cpp:iMatrix量子化は日本語性能にどう影響するか?](https://sc-bakushu.hatenablog.com/entry/2024/04/20/050213)
+ - [Command R+はどこまで量子化するとアホになってしまうのか?](https://soysoftware.sakura.ne.jp/archives/3834)
- [GGUFって結局どのサイズ選んだらいいの??](https://zenn.dev/yuki127/articles/e3337c176d27f2)

- ### Acknowledgements
+ ### 謝辞 Acknowledgements
+
Thanks to the llama.cpp community.  
llama.cppのコミュニティの皆さんに感謝します。
Thanks to u/noneabove1182 for the advice and motivation.
@@ -110,7 +114,7 @@ I do not know all the inventors of each method, so please point out any that I h
- **Language(s) (NLP):** [English, Japanese]
- **Finetuned from model [optional]:** [gemma-2-9b-it]

- ## Citation [optional]
+ ## 引用 Citation [optional]

<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
 