Text Generation
GGUF
Indonesian
English
Inference Endpoints
Ichsan2895 committed on
Commit c6150fe
1 Parent(s): ccda4ec

Update README.md

Files changed (1)
  1. README.md +15 -15
README.md CHANGED
@@ -11,7 +11,7 @@ license: cc-by-nc-sa-4.0
   other: mistral
 ---
 
- # HAPPY TO ANNOUNCE THE RELEASE OF MERAK-7B-V4-PROTOTYPE6-GGUF!
+ # HAPPY TO ANNOUNCE THE RELEASE OF MERAK-7B-V4-GGUF!
 
 Merak-7B is the Large Language Model of Indonesian Language
 
@@ -51,14 +51,14 @@ They are also compatible with many third party UIs and libraries - please see th
 
 | Name | Quant method | Bits | Size | Use case |
 | ---- | ---- | ---- | ---- | ----- |
- | [Merak-7B-v4-PROTOTYPE6-model-Q2_K.gguf](https://huggingface.co/Ichsan2895/Merak-7B-v4-PROTOTYPE6-GGUF/blob/main/Merak-7B-v4-PROTOTYPE6-model-q2_k.gguf) | Q2_K | 2 | 3.08 GB| smallest, significant quality loss - not recommended for most purposes |
- | [Merak-7B-v4-PROTOTYPE6-model-Q3_K_M.gguf](https://huggingface.co/Ichsan2895/Merak-7B-v4-PROTOTYPE6-GGUF/blob/main/Merak-7B-v4-PROTOTYPE6-model-q3_k_m.gguf) | Q3_K_M | 3 | 3.52 GB| very small, high quality loss |
- | [Merak-7B-v4-PROTOTYPE6-model-Q4_0.gguf](https://huggingface.co/Ichsan2895/Merak-7B-v4-PROTOTYPE6-GGUF/blob/main/Merak-7B-v4-PROTOTYPE6-model-q4_0.gguf) | Q4_0 | 4 | 4.11 GB| legacy; small, very high quality loss - prefer using Q3_K_M |
- | [Merak-7B-v4-PROTOTYPE6-model-Q4_K_M.gguf](https://huggingface.co/Ichsan2895/Merak-7B-v4-PROTOTYPE6-GGUF/blob/main/Merak-7B-v4-PROTOTYPE6-model-q4_k_m.gguf) | Q4_K_M | 4 | 4.37 GB| medium, balanced quality - recommended |
- | [Merak-7B-v4-PROTOTYPE6-model-Q5_0.gguf](https://huggingface.co/Ichsan2895/Merak-7B-v4-PROTOTYPE6-GGUF/blob/main/Merak-7B-v4-PROTOTYPE6-model-q5_0.gguf) | Q5_0 | 5 | 5 GB| legacy; medium, balanced quality - prefer using Q4_K_M |
- | [Merak-7B-v4-PROTOTYPE6-model-Q5_K_M.gguf](https://huggingface.co/Ichsan2895/Merak-7B-v4-PROTOTYPE6-GGUF/blob/main/Merak-7B-v4-PROTOTYPE6-model-q5_k_m.gguf) | Q5_K_M | 5 | 5.13 GB| large, very low quality loss - recommended |
- | [Merak-7B-v4-PROTOTYPE6-model-Q6_K.gguf](https://huggingface.co/Ichsan2895/Merak-7B-v4-PROTOTYPE6-GGUF/blob/main/Merak-7B-v4-PROTOTYPE6-model-q6_k.gguf) | Q6_K | 6 | 5.94 GB| very large, extremely low quality loss |
- | [Merak-7B-v4-PROTOTYPE6-model-Q8_0.gguf](https://huggingface.co/Ichsan2895/Merak-7B-v4-PROTOTYPE6-GGUF/blob/main/Merak-7B-v4-PROTOTYPE6-model-q8_0.gguf) | Q8_0 | 8 | 7.7 GB| very large, extremely low quality loss - not recommended |
+ | [Merak-7B-v4-model-Q2_K.gguf](https://huggingface.co/Ichsan2895/Merak-7B-v4-GGUF/blob/main/Merak-7B-v4-model-q2_k.gguf) | Q2_K | 2 | 3.08 GB| smallest, significant quality loss - not recommended for most purposes |
+ | [Merak-7B-v4-model-Q3_K_M.gguf](https://huggingface.co/Ichsan2895/Merak-7B-v4-GGUF/blob/main/Merak-7B-v4-model-q3_k_m.gguf) | Q3_K_M | 3 | 3.52 GB| very small, high quality loss |
+ | [Merak-7B-v4-model-Q4_0.gguf](https://huggingface.co/Ichsan2895/Merak-7B-v4-GGUF/blob/main/Merak-7B-v4-model-q4_0.gguf) | Q4_0 | 4 | 4.11 GB| legacy; small, very high quality loss - prefer using Q3_K_M |
+ | [Merak-7B-v4-model-Q4_K_M.gguf](https://huggingface.co/Ichsan2895/Merak-7B-v4-GGUF/blob/main/Merak-7B-v4-model-q4_k_m.gguf) | Q4_K_M | 4 | 4.37 GB| medium, balanced quality - recommended |
+ | [Merak-7B-v4-model-Q5_0.gguf](https://huggingface.co/Ichsan2895/Merak-7B-v4-GGUF/blob/main/Merak-7B-v4-model-q5_0.gguf) | Q5_0 | 5 | 5 GB| legacy; medium, balanced quality - prefer using Q4_K_M |
+ | [Merak-7B-v4-model-Q5_K_M.gguf](https://huggingface.co/Ichsan2895/Merak-7B-v4-GGUF/blob/main/Merak-7B-v4-model-q5_k_m.gguf) | Q5_K_M | 5 | 5.13 GB| large, very low quality loss - recommended |
+ | [Merak-7B-v4-model-Q6_K.gguf](https://huggingface.co/Ichsan2895/Merak-7B-v4-GGUF/blob/main/Merak-7B-v4-model-q6_k.gguf) | Q6_K | 6 | 5.94 GB| very large, extremely low quality loss |
+ | [Merak-7B-v4-model-Q8_0.gguf](https://huggingface.co/Ichsan2895/Merak-7B-v4-GGUF/blob/main/Merak-7B-v4-model-q8_0.gguf) | Q8_0 | 8 | 7.7 GB| very large, extremely low quality loss - not recommended |
 
 **Note**: the above RAM figures assume no GPU offloading. If layers are offloaded to the GPU, this will reduce RAM usage and use VRAM instead.
 
@@ -85,7 +85,7 @@ The following clients/libraries will automatically download models for you, prov
 
 ### In `text-generation-webui`
 
- Under Download Model, you can enter the model repo: Ichsan2895/Merak-7B-v4-PROTOTYPE6-GGUF and below it, a specific filename to download, such as: Merak-7B-v4-PROTOTYPE6-model-q5_k_m.gguf.
+ Under Download Model, you can enter the model repo: Ichsan2895/Merak-7B-v4-GGUF and below it, a specific filename to download, such as: Merak-7B-v4-model-q5_k_m.gguf.
 
 Then click Download.
 
@@ -100,7 +100,7 @@ pip3 install huggingface-hub
 Then you can download any individual model file to the current directory, at high speed, with a command like this:
 
 ```shell
- huggingface-cli download Ichsan2895/Merak-7B-v4-PROTOTYPE6-GGUF Merak-7B-v4-PROTOTYPE6-model-q5_k_m.gguf --local-dir . --local-dir-use-symlinks False
+ huggingface-cli download Ichsan2895/Merak-7B-v4-GGUF Merak-7B-v4-model-q5_k_m.gguf --local-dir . --local-dir-use-symlinks False
 ```
 
 <details>
@@ -109,7 +109,7 @@ huggingface-cli download Ichsan2895/Merak-7B-v4-PROTOTYPE6-GGUF Merak-7B-v4-PROT
 You can also download multiple files at once with a pattern:
 
 ```shell
- huggingface-cli download Ichsan2895/Merak-7B-v4-PROTOTYPE6-GGUF --local-dir . --local-dir-use-symlinks False --include='*Q4_K*gguf'
+ huggingface-cli download Ichsan2895/Merak-7B-v4-GGUF --local-dir . --local-dir-use-symlinks False --include='*Q4_K*gguf'
 ```
 
 For more documentation on downloading with `huggingface-cli`, please see: [HF -> Hub Python Library -> Download files -> Download from the CLI](https://huggingface.co/docs/huggingface_hub/guides/download#download-from-the-cli).
@@ -123,7 +123,7 @@ pip3 install hf_transfer
 And set environment variable `HF_HUB_ENABLE_HF_TRANSFER` to `1`:
 
 ```shell
- HF_HUB_ENABLE_HF_TRANSFER=1 huggingface-cli download Ichsan2895/Merak-7B-v4-PROTOTYPE6-GGUF Merak-7B-v4-PROTOTYPE6-model-q5_k_m.gguf --local-dir . --local-dir-use-symlinks False
+ HF_HUB_ENABLE_HF_TRANSFER=1 huggingface-cli download Ichsan2895/Merak-7B-v4-GGUF Merak-7B-v4-model-q5_k_m.gguf --local-dir . --local-dir-use-symlinks False
 ```
 
 Windows Command Line users: You can set the environment variable by running `set HF_HUB_ENABLE_HF_TRANSFER=1` before the download command.
@@ -136,7 +136,7 @@ Windows Command Line users: You can set the environment variable by running `set
 Make sure you are using `llama.cpp` from commit [d0cee0d](https://github.com/ggerganov/llama.cpp/commit/d0cee0d36d5be95a0d9088b674dbb27354107221) or later.
 
 ```shell
- ./main -ngl 32 -m Merak-7B-v4-PROTOTYPE6-model-q5_k_m.gguf --color -c 2048 --temp 0.7 --repeat_penalty 1.1 -n -1 -p "<|im_start|>system\n{system_message}<|im_end|>\n<|im_start|>user\n{prompt}<|im_end|>\n<|im_start|>assistant"
+ ./main -ngl 32 -m Merak-7B-v4-model-q5_k_m.gguf --color -c 2048 --temp 0.7 --repeat_penalty 1.1 -n -1 -p "<|im_start|>system\n{system_message}<|im_end|>\n<|im_start|>user\n{prompt}<|im_end|>\n<|im_start|>assistant"
 ```
 
 Change `-ngl 32` to the number of layers to offload to GPU. Remove it if you don't have GPU acceleration.
@@ -178,7 +178,7 @@ CT_METAL=1 pip install ctransformers --no-binary ctransformers
 from ctransformers import AutoModelForCausalLM
 
 # Set gpu_layers to the number of layers to offload to GPU. Set to 0 if no GPU acceleration is available on your system.
- llm = AutoModelForCausalLM.from_pretrained("Ichsan2895/Merak-7B-v4-PROTOTYPE6-GGUF", model_file="Merak-7B-v4-PROTOTYPE6-model-q5_k_m.gguf", model_type="mistral", gpu_layers=50)
+ llm = AutoModelForCausalLM.from_pretrained("Ichsan2895/Merak-7B-v4-GGUF", model_file="Merak-7B-v4-model-q5_k_m.gguf", model_type="mistral", gpu_layers=50)
 
 print(llm("AI is going to"))
 ```
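
If you prefer to fetch one of the renamed GGUF files from Python rather than the `huggingface-cli` command shown in the diff, a minimal sketch using `huggingface_hub` follows. The repo and filename use the new naming scheme from this commit; picking the Q4_K_M file is only an example, any row from the table works.

```python
from huggingface_hub import hf_hub_download

# Download a single GGUF file into the current directory.
# Swap the filename for any other quant listed in the table above.
model_path = hf_hub_download(
    repo_id="Ichsan2895/Merak-7B-v4-GGUF",
    filename="Merak-7B-v4-model-q4_k_m.gguf",
    local_dir=".",
)

print(model_path)  # local path of the downloaded file
```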
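The ctransformers snippet in the diff ends with a bare `print(llm("AI is going to"))`; in practice you will usually want to wrap your input in the same ChatML prompt template that the `./main` example uses. A hedged sketch along those lines is below: the system message and question are made-up placeholders, and the sampling values simply mirror the llama.cpp command rather than any officially recommended settings.

```python
from ctransformers import AutoModelForCausalLM

# Set gpu_layers to the number of layers to offload to GPU; use 0 for CPU-only.
llm = AutoModelForCausalLM.from_pretrained(
    "Ichsan2895/Merak-7B-v4-GGUF",
    model_file="Merak-7B-v4-model-q5_k_m.gguf",
    model_type="mistral",
    gpu_layers=50,
)

# Hypothetical system message and user question, purely for illustration.
system_message = "Anda adalah asisten AI yang selalu menjawab dalam Bahasa Indonesia."
question = "Apa ibu kota Indonesia?"

# Same ChatML-style prompt format as the llama.cpp example above.
prompt = (
    f"<|im_start|>system\n{system_message}<|im_end|>\n"
    f"<|im_start|>user\n{question}<|im_end|>\n"
    f"<|im_start|>assistant\n"
)

# Temperature mirrors the --temp 0.7 flag; stop on the ChatML end-of-turn token.
print(llm(prompt, max_new_tokens=256, temperature=0.7, stop=["<|im_end|>"]))
```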