---
base_model: google/gemma-2-2b-jpn-it
language:
- multilingual
datasets:
- TFMC/imatrix-dataset-for-japanese-llm
library_name: transformers
license: gemma
license_link: https://ai.google.dev/gemma/terms
---

## Download a file (not the whole branch) from below:

| Filename | Quant type | File Size | Split | ELYZA-tasks-100 | Description |
| -------- | ---------- | --------- | ----- | --------------- | ----------- |
| [gemma-2-2b-jpn-it.f16.gguf](https://huggingface.co/ymcki/gemma-2-2b-jpn-it-GGUF/blob/main/gemma-2-2b-jpn-it.f16.gguf) | f16 | 5.24GB | false | | Full F16 weights. |
| [gemma-2-2b-jpn-it.Q8_0.gguf](https://huggingface.co/ymcki/gemma-2-2b-jpn-it-GGUF/blob/main/gemma-2-2b-jpn-it.Q8_0.gguf) | Q8_0 | 2.78GB | false | | Extremely high quality, *recommended*. |
| [gemma-2-2b-jpn-it-imatrix.Q4_0.gguf](https://huggingface.co/ymcki/gemma-2-2b-jpn-it-GGUF/blob/main/gemma-2-2b-jpn-it-imatrix.Q4_0.gguf) | Q4_0 | 2.78GB | false | | Good quality, *recommended for edge devices with <8GB RAM*. |
| [gemma-2-2b-jpn-it-imatrix.Q4_0_8_8.gguf](https://huggingface.co/ymcki/gemma-2-2b-jpn-it-GGUF/blob/main/gemma-2-2b-jpn-it-imatrix.Q4_0_8_8.gguf) | Q4_0_8_8 | 2.78GB | false | | Good quality, *recommended for edge devices with <8GB RAM*. |
| [gemma-2-2b-jpn-it-imatrix.Q4_0_4_8.gguf](https://huggingface.co/ymcki/gemma-2-2b-jpn-it-GGUF/blob/main/gemma-2-2b-jpn-it-imatrix.Q4_0_4_8.gguf) | Q4_0_4_8 | 2.78GB | false | | Good quality, *recommended for edge devices with <8GB RAM*. |
| [gemma-2-2b-jpn-it-imatrix.Q4_0_4_4.gguf](https://huggingface.co/ymcki/gemma-2-2b-jpn-it-GGUF/blob/main/gemma-2-2b-jpn-it-imatrix.Q4_0_4_4.gguf) | Q4_0_4_4 | 2.78GB | false | | Good quality, *recommended for edge devices with <8GB RAM*. |
| [gemma-2-2b-jpn-it.Q4_0.gguf](https://huggingface.co/ymcki/gemma-2-2b-jpn-it-GGUF/blob/main/gemma-2-2b-jpn-it.Q4_0.gguf) | Q4_0 | 2.78GB | false | | Poor quality, *not recommended*. |
| [gemma-2-2b-jpn-it.Q4_0_8_8.gguf](https://huggingface.co/ymcki/gemma-2-2b-jpn-it-GGUF/blob/main/gemma-2-2b-jpn-it.Q4_0_8_8.gguf) | Q4_0_8_8 | 2.78GB | false | | Poor quality, *not recommended*. |
| [gemma-2-2b-jpn-it.Q4_0_4_8.gguf](https://huggingface.co/ymcki/gemma-2-2b-jpn-it-GGUF/blob/main/gemma-2-2b-jpn-it.Q4_0_4_8.gguf) | Q4_0_4_8 | 2.78GB | false | | Poor quality, *not recommended*. |
| [gemma-2-2b-jpn-it.Q4_0_4_4.gguf](https://huggingface.co/ymcki/gemma-2-2b-jpn-it-GGUF/blob/main/gemma-2-2b-jpn-it.Q4_0_4_4.gguf) | Q4_0_4_4 | 2.78GB | false | | Poor quality, *not recommended*. |

## How to check i8mm and sve support for ARM devices

ARM i8mm support is necessary to take advantage of the Q4_0_4_8 gguf. All ARM architectures >= ARMv8.6-A support i8mm.

ARM sve support is necessary to take advantage of the Q4_0_8_8 gguf. sve is an optional feature available from ARMv8.2-A onward, but the majority of ARM chips do not implement it.

For ARM devices without either feature, it is recommended to use Q4_0_4_4.

For Apple devices,

```
sysctl hw
```
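
The full `sysctl hw` output is long; to check just the flag of interest you can filter it (a minimal sketch, assuming the capability is reported under an `hw.optional` key containing "i8mm", as on recent Apple Silicon macOS):

```
# Keep only the i8mm-related entries; a value of 1 means the feature is present
sysctl hw | grep -i i8mm
```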

For ARM devices (i.e., most Android devices),

```
cat /proc/cpuinfo
```

There are also Android apps that can display /proc/cpuinfo.
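
If the full cpuinfo dump is too noisy, the two flags can be pulled out directly (a minimal sketch; the kernel lists them in lowercase on the `Features` line):

```
# Prints i8mm and/or sve if the CPU reports them; no output means neither is supported
grep -o -w -E 'i8mm|sve' /proc/cpuinfo | sort -u
```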

## Which Q4_0 model to use for ARM devices

| Brand | Series | Model | i8mm | sve | Quant Type |
| ----- | ------ | ----- | ---- | --- | ---------- |
| Qualcomm | Snapdragon | >= 7 Gen 1 | Yes | Yes | Q4_0_8_8 |
| Qualcomm | Snapdragon | others | No | No | Q4_0_4_4 |
| Apple | M | M1 | No | No | Q4_0_4_4 |
| Apple | M | M2/M3/M4 | Yes | No | Q4_0_4_8 |
| Apple | A | A4 to A14 | No | No | Q4_0_4_4 |
| Apple | A | A15 to A18 | Yes | No | Q4_0_4_8 |

## Convert safetensors to f16 gguf

Make sure you have llama.cpp git cloned:

```
python3 convert_hf_to_gguf.py gemma-2-2b-jpn-it/ --outfile gemma-2-2b-jpn-it.f16.gguf --outtype f16
```
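
If llama.cpp is not cloned yet, a minimal setup looks like this (upstream repository URL; `convert_hf_to_gguf.py` and its Python requirements live in the repository root):

```
# Clone llama.cpp and install the converter script's Python dependencies
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
pip install -r requirements.txt
```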

## Convert f16 gguf to Q8_0 gguf without imatrix

Make sure you have llama.cpp compiled:

```
./llama-quantize gemma-2-2b-jpn-it.f16.gguf gemma-2-2b-jpn-it.Q8_0.gguf q8_0
```
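
If it is not compiled yet, a typical CMake build looks roughly like this; with this layout the tools end up in `build/bin/`, so the commands in this card would be run as `./build/bin/llama-quantize ...` unless you copy the binaries next to the gguf files:

```
# Build the llama.cpp tools; llama-quantize and llama-imatrix land in build/bin/
cmake -B build
cmake --build build --config Release
```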

## Convert f16 gguf to other gguf with imatrix

First, prepare an imatrix from the f16 gguf and c4_en_ja_imatrix.txt:

```
./llama-imatrix -m gemma-2-2b-jpn-it.f16.gguf -f c4_en_ja_imatrix.txt -o gemma-2-2b-jpn-it.imatrix --chunks 32
```

Then, convert the f16 gguf with the imatrix to create an imatrix gguf:

```
./llama-quantize --imatrix gemma-2-2b-jpn-it.imatrix gemma-2-2b-jpn-it.f16.gguf gemma-2-2b-jpn-it-imatrix.Q4_0_8_8.gguf q4_0_8_8
```
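
To sanity-check the quantized file, it can be loaded directly with the llama.cpp CLI (binary name in recent builds; the prompt and token count are only examples):

```
# Quick smoke test: load the imatrix Q4_0_8_8 quant and generate a short reply
./llama-cli -m gemma-2-2b-jpn-it-imatrix.Q4_0_8_8.gguf -p "こんにちは。自己紹介をしてください。" -n 128
```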
## Downloading using huggingface-cli
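
A typical invocation to grab a single file (the `--include` pattern and target directory are examples; any filename from the table above works):

```
# Install the CLI, then download only the Q8_0 quant into the current directory
pip install -U "huggingface_hub[cli]"
huggingface-cli download ymcki/gemma-2-2b-jpn-it-GGUF --include "gemma-2-2b-jpn-it.Q8_0.gguf" --local-dir ./
```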
## Credits
Thank you bartowski for providing a README.md to get me started.
|