ymcki committed on
Commit 32a17dc · verified · 1 Parent(s): 58bcac9

Upload README.md

Files changed (1):
  1. README.md +75 -4

README.md CHANGED
@@ -2,6 +2,8 @@
base_model: google/gemma-2-2b-jpn-it
language:
- multilingual
+ datasets:
+ - TFMC/imatrix-dataset-for-japanese-llm
library_name: transformers
license: gemma
license_link: https://ai.google.dev/gemma/terms
@@ -28,10 +30,77 @@ Run them in [LM Studio](https://lmstudio.ai/)

## Download a file (not the whole branch) from below:

- | Filename | Quant type | File Size | Split | Description |
- | -------- | ---------- | --------- | ----- | ----------- |
- | [gemma-2-2b-jpn-it-f16.gguf](https://huggingface.co/ymcki/gemma-2-2b-jpn-it-GGUFP/blob/main/gemma-2-2b-jpn-it-f16.gguf) | f16 | 5.24GB | false | Full F16 weights. |
- | [gemma-2-2b-jpn-it-Q8_0.gguf](https://huggingface.co/ymcki/gemma-2-2b-jpn-it-GGUFP/blob/main/gemma-2-2b-jpn-it-Q8_0.gguf) | Q8_0 | 2.78GB | false | Extremely high quality, *recommended*. |
+ | Filename | Quant type | File Size | Split | ELYZA-Tasks-100 | Description |
+ | -------- | ---------- | --------- | ----- | --------------- | ----------- |
+ | [gemma-2-2b-jpn-it.f16.gguf](https://huggingface.co/ymcki/gemma-2-2b-jpn-it-GGUF/blob/main/gemma-2-2b-jpn-it.f16.gguf) | f16 | 5.24GB | false | | Full F16 weights. |
+ | [gemma-2-2b-jpn-it.Q8_0.gguf](https://huggingface.co/ymcki/gemma-2-2b-jpn-it-GGUF/blob/main/gemma-2-2b-jpn-it.Q8_0.gguf) | Q8_0 | 2.78GB | false | | Extremely high quality, *recommended*. |
+ | [gemma-2-2b-jpn-it-imatrix.Q4_0.gguf](https://huggingface.co/ymcki/gemma-2-2b-jpn-it-GGUF/blob/main/gemma-2-2b-jpn-it-imatrix.Q4_0.gguf) | Q4_0 | 2.78GB | false | | Good quality, *recommended for edge devices with <8GB RAM*. |
+ | [gemma-2-2b-jpn-it-imatrix.Q4_0_8_8.gguf](https://huggingface.co/ymcki/gemma-2-2b-jpn-it-GGUF/blob/main/gemma-2-2b-jpn-it-imatrix.Q4_0_8_8.gguf) | Q4_0_8_8 | 2.78GB | false | | Good quality, *recommended for edge devices with <8GB RAM*. |
+ | [gemma-2-2b-jpn-it-imatrix.Q4_0_4_8.gguf](https://huggingface.co/ymcki/gemma-2-2b-jpn-it-GGUF/blob/main/gemma-2-2b-jpn-it-imatrix.Q4_0_4_8.gguf) | Q4_0_4_8 | 2.78GB | false | | Good quality, *recommended for edge devices with <8GB RAM*. |
+ | [gemma-2-2b-jpn-it-imatrix.Q4_0_4_4.gguf](https://huggingface.co/ymcki/gemma-2-2b-jpn-it-GGUF/blob/main/gemma-2-2b-jpn-it-imatrix.Q4_0_4_4.gguf) | Q4_0_4_4 | 2.78GB | false | | Good quality, *recommended for edge devices with <8GB RAM*. |
+ | [gemma-2-2b-jpn-it.Q4_0.gguf](https://huggingface.co/ymcki/gemma-2-2b-jpn-it-GGUF/blob/main/gemma-2-2b-jpn-it.Q4_0.gguf) | Q4_0 | 2.78GB | false | | Poor quality, *not recommended*. |
+ | [gemma-2-2b-jpn-it.Q4_0_8_8.gguf](https://huggingface.co/ymcki/gemma-2-2b-jpn-it-GGUF/blob/main/gemma-2-2b-jpn-it.Q4_0_8_8.gguf) | Q4_0_8_8 | 2.78GB | false | | Poor quality, *not recommended*. |
+ | [gemma-2-2b-jpn-it.Q4_0_4_8.gguf](https://huggingface.co/ymcki/gemma-2-2b-jpn-it-GGUF/blob/main/gemma-2-2b-jpn-it.Q4_0_4_8.gguf) | Q4_0_4_8 | 2.78GB | false | | Poor quality, *not recommended*. |
+ | [gemma-2-2b-jpn-it.Q4_0_4_4.gguf](https://huggingface.co/ymcki/gemma-2-2b-jpn-it-GGUF/blob/main/gemma-2-2b-jpn-it.Q4_0_4_4.gguf) | Q4_0_4_4 | 2.78GB | false | | Poor quality, *not recommended*. |
+
+ ## How to check i8mm and sve support for ARM devices
+
+ ARM i8mm support is necessary to take advantage of the Q4_0_4_8 gguf. All ARM architectures >= ARMv8.6-A support i8mm.
+
+ ARM SVE support is necessary to take advantage of the Q4_0_8_8 gguf. SVE is an optional feature available from ARMv8.2-A onward, but the majority of ARM chips do not implement it.
+
+ For ARM devices that support neither, Q4_0_4_4 is recommended.
+
+ For Apple devices, run:
+
+ ```
+ sysctl hw
+ ```
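+
+ On Apple Silicon, the output can be narrowed to the relevant flag. This is a sketch; the exact key name is an assumption based on Apple's FEAT_* sysctl naming, so verify it against the full `sysctl hw` listing:
+
+ ```
+ # 1 means the CPU supports i8mm; key name assumed from Apple's FEAT_* convention
+ sysctl hw.optional.arm.FEAT_I8MM
+ ```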
+
+ For ARM devices (i.e., most Android devices), run:
+
+ ```
+ cat /proc/cpuinfo
+ ```
+
+ There are also Android apps that can display /proc/cpuinfo.
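+
+ To filter for the two relevant flags directly (a sketch; assumes a shell on the device, e.g. adb shell or Termux):
+
+ ```
+ # Prints i8mm and/or sve if the kernel reports them; no output means neither
+ grep -o -E 'i8mm|sve' /proc/cpuinfo | sort -u
+ ```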
+
+ ## Which Q4_0 model to use for ARM devices
+
+ | Brand | Series | Model | i8mm | sve | Quant Type |
+ | ----- | ------ | ----- | ---- | --- | ---------- |
+ | Qualcomm | Snapdragon | >= 7 Gen 1 | Yes | Yes | Q4_0_8_8 |
+ | Qualcomm | Snapdragon | others | No | No | Q4_0_4_4 |
+ | Apple | M | M1 | No | No | Q4_0_4_4 |
+ | Apple | M | M2/M3/M4 | Yes | No | Q4_0_4_8 |
+ | Apple | A | A4 to A14 | No | No | Q4_0_4_4 |
+ | Apple | A | A15 to A18 | Yes | No | Q4_0_4_8 |
+
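+ As a convenience, the table's logic for Linux/Android devices can be folded into a small shell sketch (an assumption of this sketch: sve takes priority over i8mm, mirroring the rows above):
+
+ ```
+ #!/bin/sh
+ # Pick a Q4_0 variant from the CPU feature flags in /proc/cpuinfo
+ features=$(grep -m1 -i '^features' /proc/cpuinfo)
+ case "$features" in
+   *sve*)  echo "Q4_0_8_8" ;;
+   *i8mm*) echo "Q4_0_4_8" ;;
+   *)      echo "Q4_0_4_4" ;;
+ esac
+ ```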
+
+ ## Convert safetensors to f16 gguf
+
+ Make sure you have llama.cpp cloned:
+
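+ A minimal setup sketch, assuming a fresh environment (the repository URL and requirements file are those of the upstream llama.cpp project):
+
+ ```
+ git clone https://github.com/ggerganov/llama.cpp
+ cd llama.cpp
+ # Python dependencies for the conversion script
+ pip install -r requirements.txt
+ ```
+
+ The conversion below is then run from inside the llama.cpp checkout, with the gemma-2-2b-jpn-it/ safetensors directory placed alongside it.
+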
+ ```
+ python3 convert_hf_to_gguf.py gemma-2-2b-jpn-it/ --outfile gemma-2-2b-jpn-it.f16.gguf --outtype f16
+ ```
+
+ ## Convert f16 gguf to Q8_0 gguf without imatrix
+
+ Make sure you have llama.cpp compiled:
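+
+ For reference, a typical CMake build of current llama.cpp looks like the following (a sketch; binary locations can vary between versions):
+
+ ```
+ cmake -B build
+ cmake --build build --config Release
+ # binaries such as llama-quantize land in build/bin/
+ ```
+
+ With that done, quantizing to Q8_0 is a single command: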
+
+ ```
+ ./llama-quantize gemma-2-2b-jpn-it.f16.gguf gemma-2-2b-jpn-it.Q8_0.gguf q8_0
+ ```
+
+ ## Convert f16 gguf to other gguf with imatrix
+
+ First, prepare an imatrix from the f16 gguf and c4_en_ja_imatrix.txt:
+
+ ```
+ ./llama-imatrix -m gemma-2-2b-jpn-it.f16.gguf -f c4_en_ja_imatrix.txt -o gemma-2-2b-jpn-it.imatrix --chunks 32
+ ```
+
+ Then, quantize the f16 gguf with the imatrix to create the imatrix gguf:
+
+ ```
+ ./llama-quantize --imatrix gemma-2-2b-jpn-it.imatrix gemma-2-2b-jpn-it.f16.gguf gemma-2-2b-jpn-it-imatrix.Q4_0_8_8.gguf q4_0_8_8
+ ```
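+
+ As a quick smoke test, the quantized file can be loaded with llama-cli from the same build (the prompt and token count here are arbitrary):
+
+ ```
+ # Generate a short completion to confirm the quantized model loads and runs
+ ./llama-cli -m gemma-2-2b-jpn-it-imatrix.Q4_0_8_8.gguf -p "日本の首都は" -n 32
+ ```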

## Downloading using huggingface-cli

@@ -50,3 +119,5 @@ huggingface-cli download ymcki/gemma-2-2b-jpn-it-GGUF --include "gemma-2-2b-jpn-
 
119
  ## Credits
120
 
121
  Thank you bartowski for providing a README.md to get me started.
122
+
123
+ Thank you YoutechA320U for the ELYZA-tasks-100 auto evaluation tool.