claysauruswrecks committed on
Commit 0453cf0 · 1 Parent(s): 7f75291

add README.md, quantized gguf models

.gitattributes CHANGED
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.zip filter=lfs diff=lfs merge=lfs -text
  *.zst filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
+ *.gguf filter=lfs diff=lfs merge=lfs -text
Giraffe-v2-13b-32k.Q4_K_M.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:6c5286041991b7b21e310e66cd6a89d465e82787b7431a47c74519b1ce477b92
+ size 7865956224
Giraffe-v2-13b-32k.Q5_K_M.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:5cff7a6e94a6da58a09ed028ae90f9856b6a1440c7900fefd7f3d5c7efd6bc4e
+ size 9229924224
Giraffe-v2-13b-32k.Q8_0.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:83f12f6bd1df60f958448b46526af888347b7403c7efd89c8b4a1a63051279d7
+ size 13831319424
README.md CHANGED
@@ -1,3 +1,118 @@
  ---
+ base_model: abacusai/Giraffe-v2-13b-32k
+ inference: false
+ language:
+ - en
+ library_name: transformers
  license: llama2
+ model_creator: Abacus.AI
+ model_name: Giraffe v2 13B 32K GGUF
+ model_type: llama2
+ quantized_by: claysauruswrecks
  ---
+
+ # Giraffe v2 13B 32K - GGUF
+
+ - Model creator: [Abacus.AI](https://huggingface.co/abacusai/)
+ - Original model: [Giraffe v2 13B 32K](https://huggingface.co/abacusai/Giraffe-v2-13b-32k)
+
+ <!-- description start -->
+ ## Description
+
+ This repo contains GGUF format model files for [Abacus.AI's Giraffe v2 13B 32K](https://huggingface.co/abacusai/Giraffe-v2-13b-32k).
+
+ These files were quantized on an Intel i9-9980HK with 32 GB of RAM.
+
+ <!-- description end -->
+ <!-- README_GGUF.md-about-gguf start -->
+ ### About GGUF
+
+ GGUF is a new format introduced by the llama.cpp team on August 21st, 2023. It is a replacement for GGML, which is no longer supported by llama.cpp.
+
+ Here is an incomplete list of clients and libraries that are known to support GGUF:
+
+ - [llama.cpp](https://github.com/ggerganov/llama.cpp). The source project for GGUF. Offers a CLI and a server option.
+ - [text-generation-webui](https://github.com/oobabooga/text-generation-webui), the most widely used web UI, with many features and powerful extensions. Supports GPU acceleration.
+ - [KoboldCpp](https://github.com/LostRuins/koboldcpp), a fully featured web UI, with GPU acceleration across all platforms and GPU architectures. Especially good for storytelling.
+ - [LM Studio](https://lmstudio.ai/), an easy-to-use and powerful local GUI for Windows and macOS (Silicon), with GPU acceleration.
+ - [LoLLMS Web UI](https://github.com/ParisNeo/lollms-webui), a great web UI with many interesting and unique features, including a full model library for easy model selection.
+ - [Faraday.dev](https://faraday.dev/), an attractive and easy-to-use character-based chat GUI for Windows and macOS (both Silicon and Intel), with GPU acceleration.
+ - [ctransformers](https://github.com/marella/ctransformers), a Python library with GPU acceleration, LangChain support, and an OpenAI-compatible AI server.
+ - [llama-cpp-python](https://github.com/abetlen/llama-cpp-python), a Python library with GPU acceleration, LangChain support, and an OpenAI-compatible API server.
+ - [candle](https://github.com/huggingface/candle), a Rust ML framework with a focus on performance, including GPU support, and ease of use.
+ <!-- README_GGUF.md-about-gguf end -->
+
+ <!-- compatibility_gguf start -->
+ ## Compatibility
+
+ These quantized GGUFv2 files are compatible with llama.cpp from November 1st, 2023 onwards, as of commit [c43c2da](https://github.com/ggerganov/llama.cpp/commit/c43c2da8afacaddfe51c09b21dbd9922cd0ea46b).
+
+ They are also compatible with many third-party UIs and libraries - please see the list in the About GGUF section above.
+ <!-- compatibility_gguf end -->
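+
+ As a quick integrity check after downloading, you can read the GGUF header directly: a GGUF file begins with the 4-byte magic `GGUF` followed by a little-endian `uint32` format version (`2` for GGUFv2). A minimal sketch in plain Python; the filename is just an example:
+
+ ```python
+ import struct
+
+ # Read the 8-byte GGUF header: 4 magic bytes, then a uint32 format version.
+ with open("Giraffe-v2-13b-32k.Q4_K_M.gguf", "rb") as f:
+     magic = f.read(4)
+     version = struct.unpack("<I", f.read(4))[0]
+
+ assert magic == b"GGUF", f"not a GGUF file: {magic!r}"
+ print(f"GGUF version: {version}")  # expect 2 for these files
+ ```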
+
+ ## Usage
+
+ I have only tested these in `text-generation-webui` with `ctx=2048` and `compress_pos_emb` from `4` through `8`.
+
+ TODO: Make sure the longer context is actually functional.
+
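+ For a rough equivalent outside the web UI, `llama-cpp-python` exposes linear RoPE scaling as `rope_freq_scale`, the reciprocal of `compress_pos_emb`. A hedged sketch, assuming a local copy of the Q4_K_M file; the context length and scale below are illustrative, not validated:
+
+ ```python
+ from llama_cpp import Llama
+
+ # compress_pos_emb=8 in text-generation-webui corresponds to rope_freq_scale=1/8.
+ llm = Llama(
+     model_path="Giraffe-v2-13b-32k.Q4_K_M.gguf",
+     n_ctx=16384,            # illustrative; long-context behavior is untested (see TODO above)
+     rope_freq_scale=0.125,
+ )
+
+ out = llm("Giraffes are", max_tokens=64)
+ print(out["choices"][0]["text"])
+ ```
+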
+ <!-- README_GGUF.md-provided-files start -->
+ ## Provided files
+
+ | Name | Quant method | Bits | Size | Use case |
+ | ---- | ---- | ---- | ---- | ---- |
+ | [Giraffe-v2-13b-32k.Q4_K_M.gguf](https://huggingface.co/claysauruswrecks/Giraffe-v2-13b-32k-GGUF/blob/main/Giraffe-v2-13b-32k.Q4_K_M.gguf) | Q4_K_M | 4 | 7.4 GB | medium, balanced quality - recommended |
+ | [Giraffe-v2-13b-32k.Q5_K_M.gguf](https://huggingface.co/claysauruswrecks/Giraffe-v2-13b-32k-GGUF/blob/main/Giraffe-v2-13b-32k.Q5_K_M.gguf) | Q5_K_M | 5 | 8.6 GB | large, very low quality loss - recommended |
+ | [Giraffe-v2-13b-32k.Q8_0.gguf](https://huggingface.co/claysauruswrecks/Giraffe-v2-13b-32k-GGUF/blob/main/Giraffe-v2-13b-32k.Q8_0.gguf) | Q8_0 | 8 | 13.0 GB | very large, extremely low quality loss - not recommended unless as a treat |
+
+ <!-- README_GGUF.md-provided-files end -->
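+
+ To fetch one of these files programmatically, `huggingface_hub` is a convenient option; a minimal sketch using the repo and filename from the table above:
+
+ ```python
+ from huggingface_hub import hf_hub_download
+
+ # Downloads into the local Hugging Face cache and returns the resolved path.
+ path = hf_hub_download(
+     repo_id="claysauruswrecks/Giraffe-v2-13b-32k-GGUF",
+     filename="Giraffe-v2-13b-32k.Q4_K_M.gguf",
+ )
+ print(path)
+ ```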
+
+ <!-- footer start -->
+ Thanks to [TheBloke](https://huggingface.co/TheBloke) for the README.md template.
+ <!-- footer end -->
+
+ <!-- original-model-card start -->
+ ## Original model card: Abacus.AI's Giraffe v2 13B 32K
+
+ ### Model Card: Giraffe-v2-13b-32k
+
+ #### Model Details
+
+ #### Model Description
+
+ We have followed up on our previous training runs related to extending the context length
+ of Llama models. The associated GitHub repository
+
+ https://github.com/abacusai/long-context
+
+ has some basic details on our approach and metrics. We have also published a paper on arXiv
+ that covers our experiments and analysis much more comprehensively:
+
+ http://arxiv.org/abs/2308.10882
+
+ - **Developed by:** [Abacus.AI](https://abacus.ai)
+ - **Model type:** Transformer-based autoregressive causal language model
+ - **License:** Llama 2 Community License: https://github.com/facebookresearch/llama/blob/main/LICENSE
+ - **Finetuned from model:** Llama V2 13B
+
+ ### Usage
+
+ To use this model at longer context lengths, it must be patched to interpolate the position
+ embeddings. It will not work if it is simply loaded with the `AutoModel` framework of `transformers`.
+ For full details and usage see:
+
+ https://github.com/abacusai/Long-Context
+
+ The evaluation section has detailed code for how to load and patch the model for inference (or further fine-tuning).
+ Note in particular that `max_position_embeddings` is not relevant, since the patched module dynamically reallocates
+ the position buffers as required.
+
+ The tokenizer corresponding to this model is https://huggingface.co/abacusai/Giraffe-v1-Tokenizer.
+
+ Using the code in the repository, you can load this model as follows:
+ ```python
+ from models import load_model, load_tokenizer
+ tokenizer = load_tokenizer()
+ model = load_model('abacusai/Giraffe-v2-13b-32k', scale=8)
+ ```
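+
+ From there, generation should follow the standard `transformers` pattern - a hedged sketch, assuming `load_model` and `load_tokenizer` return ordinary `transformers` objects (the prompt is illustrative):
+
+ ```python
+ # Continues from the loading snippet above.
+ inputs = tokenizer("The giraffe is the tallest living land animal.", return_tensors="pt")
+ inputs = {k: v.to(model.device) for k, v in inputs.items()}
+ outputs = model.generate(**inputs, max_new_tokens=64, do_sample=False)
+ print(tokenizer.decode(outputs[0], skip_special_tokens=True))
+ ```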
+ <!-- original-model-card end -->