Commit 68bd6b9
Parent(s): 4433df2

improvement: readme
README.md CHANGED
@@ -12,7 +12,6 @@ quantized_by: shaowenchen
 tasks:
   - text2text-generation
 tags:
-  - meta
   - gguf
   - llama
   - llama-2
@@ -42,20 +41,28 @@ tags:
 Usage:
 
 ```bash
-docker run --rm -it -p 8000:8000 -v /path/to/models:/models -e MODEL=/models/gguf-model-name.gguf
+docker run --rm -it -p 8000:8000 -v /path/to/models:/models -e MODEL=/models/gguf-model-name.gguf shaowenchen/llama-cpp-python:0.2.6
 ```
 
 ## Provided images
 
-| Name                                               | Quant method | Size
-| -------------------------------------------------- | ------------ |
-| `shaowenchen/colossal-llama-2-7b-base-gguf:Q2_K`   | Q2_K         | 3.68 GB
-| `shaowenchen/colossal-llama-2-7b-base-gguf:Q3_K`   | Q3_K         | 4.16 GB
-| `shaowenchen/colossal-llama-2-7b-base-gguf:Q3_K_L` | Q3_K_L       | 4.46 GB
-| `shaowenchen/colossal-llama-2-7b-base-gguf:Q3_K_S` | Q3_K_S       | 3.81 GB
-| `shaowenchen/colossal-llama-2-7b-base-gguf:Q4_0`   | Q4_0         | 4.7 GB
-| `shaowenchen/colossal-llama-2-7b-base-gguf:
-| `shaowenchen/colossal-llama-2-7b-base-gguf:
+| Name                                               | Quant method | Compressed Size |
+| -------------------------------------------------- | ------------ | --------------- |
+| `shaowenchen/colossal-llama-2-7b-base-gguf:Q2_K`   | Q2_K         | 3.68 GB         |
+| `shaowenchen/colossal-llama-2-7b-base-gguf:Q3_K`   | Q3_K         | 4.16 GB         |
+| `shaowenchen/colossal-llama-2-7b-base-gguf:Q3_K_L` | Q3_K_L       | 4.46 GB         |
+| `shaowenchen/colossal-llama-2-7b-base-gguf:Q3_K_S` | Q3_K_S       | 3.81 GB         |
+| `shaowenchen/colossal-llama-2-7b-base-gguf:Q4_0`   | Q4_0         | 4.7 GB          |
+| `shaowenchen/colossal-llama-2-7b-base-gguf:Q4_1`   | Q4_1         | 5.1 GB          |
+| `shaowenchen/colossal-llama-2-7b-base-gguf:Q4_K`   | Q4_K         | 4.95 GB         |
+| `shaowenchen/colossal-llama-2-7b-base-gguf:Q4_K_S` | Q4_K_S       | 4.73 GB         |
+| `shaowenchen/colossal-llama-2-7b-base-gguf:Q5_0`   | Q5_0         | 5.3 GB          |
+| `shaowenchen/colossal-llama-2-7b-base-gguf:Q5_1`   | Q5_1         | 5.7 GB          |
+| `shaowenchen/colossal-llama-2-7b-base-gguf:Q5_K`   | Q5_K         | 5.5 GB          |
+| `shaowenchen/colossal-llama-2-7b-base-gguf:Q5_K_S` | Q5_K_S       | 5.3 GB          |
+| `shaowenchen/colossal-llama-2-7b-base-gguf:Q6_K`   | Q6_K         | 6.3 GB          |
+| `shaowenchen/colossal-llama-2-7b-base-gguf:Q8_0`   | Q8_0         | 8.2 GB          |
+| `shaowenchen/colossal-llama-2-7b-base-gguf:full`   | full         | 14 GB           |
 
 Usage:
 
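The updated usage line starts llama-cpp-python's server inside the container. As a sketch only, not part of this commit: the port comes from the `-p 8000:8000` mapping in the README's `docker run` command, and the `/v1/completions` path and request fields assume llama-cpp-python's default OpenAI-compatible API. A completion request could then look like:

```shell
# Sketch: assumes the container started by the docker run command above
# is serving llama-cpp-python's OpenAI-compatible API on localhost:8000.
body='{"prompt": "Q: What is GGUF? A:", "max_tokens": 32}'
curl -s http://localhost:8000/v1/completions \
  -H 'Content-Type: application/json' \
  -d "$body" || true  # curl exits nonzero harmlessly if no server is listening
```

The same endpoint shape lets OpenAI-style client libraries point at the container by overriding their base URL.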