DevsDoCode committed on
Commit
f819928
·
verified ·
1 Parent(s): a1b9191

Update README.md

  1. README.md +54 -24
README.md CHANGED
@@ -1,9 +1,8 @@
1
  ---
2
- language:
3
- - en
4
- license: apache-2.0
5
  library_name: transformers
6
  tags:
 
 
7
  - uncensored
8
  - transformers
9
  - llama
@@ -11,35 +10,66 @@ tags:
11
  - unsloth
12
  - llama-cpp
13
  - gguf-my-repo
 
 
 
14
  pipeline_tag: text-generation
15
  ---
16
 
17
- # DevsDoCode/LLama-3-8b-Uncensored-Q3_K_M-GGUF
18
- This model was converted to GGUF format from [`DevsDoCode/LLama-3-8b-Uncensored`](https://huggingface.co/DevsDoCode/LLama-3-8b-Uncensored) using llama.cpp via the ggml.ai's [GGUF-my-repo](https://huggingface.co/spaces/ggml-org/gguf-my-repo) space.
19
- Refer to the [original model card](https://huggingface.co/DevsDoCode/LLama-3-8b-Uncensored) for more details on the model.
20
- ## Use with llama.cpp
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
21
 
22
- Install llama.cpp through brew.
23
 
24
- ```bash
25
- brew install ggerganov/ggerganov/llama.cpp
26
- ```
27
- Invoke the llama.cpp server or the CLI.
28
 
29
- CLI:
30
 
31
- ```bash
32
- llama-cli --hf-repo DevsDoCode/LLama-3-8b-Uncensored-Q3_K_M-GGUF --model llama-3-8b-uncensored.Q3_K_M.gguf -p "The meaning to life and the universe is"
33
- ```
34
 
35
- Server:
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
36
 
37
- ```bash
38
- llama-server --hf-repo DevsDoCode/LLama-3-8b-Uncensored-Q3_K_M-GGUF --model llama-3-8b-uncensored.Q3_K_M.gguf -c 2048
39
- ```
40
 
41
- Note: You can also use this checkpoint directly through the [usage steps](https://github.com/ggerganov/llama.cpp?tab=readme-ov-file#usage) listed in the Llama.cpp repo as well.
42
 
43
- ```
44
- git clone https://github.com/ggerganov/llama.cpp && cd llama.cpp && make && ./main -m llama-3-8b-uncensored.Q3_K_M.gguf -n 128
45
- ```
 
 
 
 
 
 
1
  ---
 
 
 
2
  library_name: transformers
3
  tags:
4
+ - llama-cpp
5
+ - gguf-my-repo
6
  - uncensored
7
  - transformers
8
  - llama
 
10
  - unsloth
11
  - llama-cpp
12
  - gguf-my-repo
13
+ language:
14
+ - en
15
+ license: apache-2.0
16
  pipeline_tag: text-generation
17
  ---
18
 
19
+ <div align="center">
20
+ <!-- Replace `#` with your actual links -->
21
+ <a href="https://youtube.com/@devsdocode"><img alt="YouTube" src="https://img.shields.io/badge/YouTube-FF0000?style=for-the-badge&logo=youtube&logoColor=white"></a>
22
+ <a href="https://t.me/devsdocode"><img alt="Telegram" src="https://img.shields.io/badge/Telegram-2CA5E0?style=for-the-badge&logo=telegram&logoColor=white"></a>
23
+ <a href="https://www.instagram.com/sree.shades_/"><img alt="Instagram" src="https://img.shields.io/badge/Instagram-E4405F?style=for-the-badge&logo=instagram&logoColor=white"></a>
24
+ <a href="https://www.linkedin.com/in/developer-sreejan/"><img alt="LinkedIn" src="https://img.shields.io/badge/LinkedIn-0077B5?style=for-the-badge&logo=linkedin&logoColor=white"></a>
25
+ <a href="https://buymeacoffee.com/devsdocode"><img alt="Buy Me A Coffee" src="https://img.shields.io/badge/Buy%20Me%20A%20Coffee-FFDD00?style=for-the-badge&logo=buymeacoffee&logoColor=black"></a>
26
+ </div>
27
+
28
+ ## Crafted with ❤️ by Devs Do Code (Sree)
29
+
30
+ ### GGUF Technical Specifications
31
+
32
+ GGUF is a file format that builds on the earlier GGJT format. It is designed to be more extensible and easier to use, and has the following features:
33
+
34
+ **Single-file Deployment:** A GGUF model is distributed and loaded as one file; no external files are needed for additional information.
35
+
36
+ **Extensibility:** New features can be added to GGML-based executors, and new information to GGUF models, without breaking compatibility with existing models.
37
+
38
+ **mmap Compatibility:** GGUF models can be loaded with mmap, which allows fast loading and saving.
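
As an illustration of the mmap point, the sketch below memory-maps a file read-only and compares its first four bytes with the GGUF magic. It assumes only that a GGUF file starts with the ASCII bytes `GGUF`; the helper name `has_gguf_magic` is hypothetical and not part of any library.

```python
import mmap
import os

GGUF_MAGIC = b"GGUF"  # first four bytes of a GGUF file


def has_gguf_magic(path: str) -> bool:
    """Memory-map `path` read-only and report whether it starts with the GGUF magic."""
    # mmap cannot map an empty file, and a file shorter than the magic cannot match.
    if os.path.getsize(path) < len(GGUF_MAGIC):
        return False
    with open(path, "rb") as f:
        # Length 0 maps the whole file. The OS pages data in lazily, so only
        # the region that is actually touched is read from disk.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            return mm[: len(GGUF_MAGIC)] == GGUF_MAGIC
```

Because the mapping is lazy, the same check costs about the same for a multi-gigabyte model as for a tiny file.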
39
+
40
+ **User-Friendly:** Models can be loaded and saved with a small amount of code in any programming language, without external libraries.
41
 
42
+ **Full Information:** Everything needed to load a model is contained in the model file; users do not have to supply additional data.
43
 
44
+ The key difference between GGJT and GGUF is that GGUF stores hyperparameters (now called metadata) as key-value pairs rather than as a list of untyped values. New metadata can therefore be added without breaking compatibility with existing models, and a model can carry additional information that is useful for inference or for identifying it.
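
To make the key-value structure concrete, here is a minimal sketch of reading the header and metadata of a GGUF file with only the Python standard library. It assumes the little-endian version 2/3 layout described in the GGUF specification (4-byte magic, `uint32` version, `uint64` tensor count, `uint64` metadata count, then typed key-value pairs). The function names are hypothetical, and the sketch stops before the tensor descriptions.

```python
import struct

# Scalar value-type ids from the GGUF specification mapped to struct formats.
# Assumption: little-endian file, version 2 or 3 (64-bit counts and lengths).
_SCALAR_FORMATS = {
    0: "<B", 1: "<b", 2: "<H", 3: "<h", 4: "<I", 5: "<i",
    6: "<f", 7: "<?", 10: "<Q", 11: "<q", 12: "<d",
}
_TYPE_STRING = 8
_TYPE_ARRAY = 9


def _read(stream, fmt):
    """Read and unpack exactly one value of struct format `fmt`."""
    size = struct.calcsize(fmt)
    data = stream.read(size)
    if len(data) != size:
        raise EOFError("unexpected end of file")
    return struct.unpack(fmt, data)[0]


def _read_string(stream):
    """A GGUF string is a uint64 byte length followed by UTF-8 bytes."""
    length = _read(stream, "<Q")
    return stream.read(length).decode("utf-8")


def _read_value(stream, type_id):
    """Read one metadata value of the given type id."""
    if type_id in _SCALAR_FORMATS:
        return _read(stream, _SCALAR_FORMATS[type_id])
    if type_id == _TYPE_STRING:
        return _read_string(stream)
    if type_id == _TYPE_ARRAY:
        # An array stores its element type and count, then the elements.
        element_type = _read(stream, "<I")
        count = _read(stream, "<Q")
        return [_read_value(stream, element_type) for _ in range(count)]
    raise ValueError(f"unknown metadata value type {type_id}")


def read_gguf_metadata(stream):
    """Return (version, tensor_count, metadata dict) from a binary stream."""
    if stream.read(4) != b"GGUF":
        raise ValueError("not a GGUF file")
    version = _read(stream, "<I")
    tensor_count = _read(stream, "<Q")
    kv_count = _read(stream, "<Q")
    metadata = {}
    for _ in range(kv_count):
        key = _read_string(stream)
        type_id = _read(stream, "<I")
        metadata[key] = _read_value(stream, type_id)
    return version, tensor_count, metadata
```

Each value carries its own type id, so a reader can still parse a key it does not recognize and move on to the next one; this is what lets new metadata be added without breaking older files.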
 
 
 
45
 
 
46
 
47
+ **Quantization methods:**
 
 
48
 
49
+ | Method | Quantization | Advantages | Trade-offs |
50
+ |---|---|---|---|
51
+ | q2_k | 2-bit integers | Smallest files of the methods listed | Largest accuracy loss of the methods listed |
52
+ | q3_k_l | 3-bit integers | Best accuracy of the 3-bit variants (L = large: more tensors kept at higher precision) | Largest of the 3-bit variants |
53
+ | q3_k_m | 3-bit integers | Middle ground among the 3-bit variants (M = medium mix of precisions) | Larger than q3_k_s, less accurate than q3_k_l |
54
+ | q3_k_s | 3-bit integers | Smallest of the 3-bit variants (S = small) | Largest accuracy loss of the 3-bit variants |
55
+ | q4_0 | 4-bit integers | Simple legacy format with one scale per block of weights | Generally less accurate than the 4-bit k-quants at a similar size |
56
+ | q4_1 | 4-bit integers | Legacy format that stores a per-block minimum alongside the scale, improving accuracy over q4_0 | Larger than q4_0 |
57
+ | q4_k_m | 4-bit integers | Balanced size and accuracy (M = medium mix of precisions) | Larger than q4_k_s |
58
+ | q4_k_s | 4-bit integers | Smaller than q4_k_m (S = small) | Lower accuracy than q4_k_m |
59
+ | q5_0 | 5-bit integers | Simple legacy format with one scale per block of weights | Generally less accurate than the 5-bit k-quants at a similar size |
60
+ | q5_1 | 5-bit integers | Legacy format that stores a per-block minimum alongside the scale, improving accuracy over q5_0 | Larger than q5_0 |
61
+ | q5_k_m | 5-bit integers | Low accuracy loss (M = medium mix of precisions) | Larger than q5_k_s |
62
+ | q5_k_s | 5-bit integers | Smaller than q5_k_m (S = small) | Lower accuracy than q5_k_m |
63
+ | q6_k | 6-bit integers | Accuracy close to q8_0 at a smaller size | Larger than the 5-bit variants |
64
+ | q8_0 | 8-bit integers | Accuracy closest to the unquantized weights | Largest files of the methods listed; least size reduction |
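
To show where the size and accuracy trade-off in the table comes from, the following is a simplified sketch of block-wise 8-bit quantization in the spirit of q8_0: each block of 32 weights shares one scale, and every weight is stored as a signed 8-bit integer. The function names are hypothetical, and this is not llama.cpp's implementation; real files store the scale as a 16-bit float, and the k-quant methods use more elaborate layouts.

```python
BLOCK_SIZE = 32  # weights per block, as in the q8_0 and q4_0 layouts


def quantize_block_q8(values):
    """Return (scale, ints) for one block: a float scale plus signed 8-bit integers."""
    if len(values) != BLOCK_SIZE:
        raise ValueError(f"expected {BLOCK_SIZE} values, got {len(values)}")
    largest = max(abs(v) for v in values)
    scale = largest / 127.0
    if scale == 0.0:
        # An all-zero block needs no scaling.
        return 0.0, [0] * BLOCK_SIZE
    ints = [max(-127, min(127, round(v / scale))) for v in values]
    return scale, ints


def dequantize_block_q8(scale, ints):
    """Reconstruct approximate weights from a quantized block."""
    return [scale * q for q in ints]


def bits_per_weight(payload_bits, scale_bytes=2):
    """Effective bits per weight for a block that stores one shared scale."""
    return (BLOCK_SIZE * payload_bits + 8 * scale_bytes) / BLOCK_SIZE
```

With a 2-byte scale per block, `bits_per_weight(8)` gives 8.5 and `bits_per_weight(4)` gives 4.5, which is why the files are slightly larger than the nominal bit width suggests. The rounding error per weight is at most half the block's scale, so blocks with large outliers lose more precision.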
65
 
 
 
 
66
 
 
67
 
68
+ <div align="center">
69
+ <!-- Replace `#` with your actual links -->
70
+ <a href="https://youtube.com/@devsdocode"><img alt="YouTube" src="https://img.shields.io/badge/YouTube-FF0000?style=for-the-badge&logo=youtube&logoColor=white"></a>
71
+ <a href="https://t.me/devsdocode"><img alt="Telegram" src="https://img.shields.io/badge/Telegram-2CA5E0?style=for-the-badge&logo=telegram&logoColor=white"></a>
72
+ <a href="https://www.instagram.com/sree.shades_/"><img alt="Instagram" src="https://img.shields.io/badge/Instagram-E4405F?style=for-the-badge&logo=instagram&logoColor=white"></a>
73
+ <a href="https://www.linkedin.com/in/developer-sreejan/"><img alt="LinkedIn" src="https://img.shields.io/badge/LinkedIn-0077B5?style=for-the-badge&logo=linkedin&logoColor=white"></a>
74
+ <a href="https://buymeacoffee.com/devsdocode"><img alt="Buy Me A Coffee" src="https://img.shields.io/badge/Buy%20Me%20A%20Coffee-FFDD00?style=for-the-badge&logo=buymeacoffee&logoColor=black"></a>
75
+ </div>