---
license:
- cc-by-nc-4.0
- llama2
language:
- en
library_name: ExLlamaV2
pipeline_tag: text-generation
tags:
- Mytho
- ReMM
- LLaMA 2
- Quantized Model
- exl2
base_model:
- Undi95/ReMM-v2.2-L2-13B
---
# exl2 quants for ReMM V2.2
This repository includes the quantized models for the [ReMM V2.2](https://huggingface.co/Undi95/ReMM-v2.2-L2-13B) model by [Undi](https://huggingface.co/Undi95). ReMM is a model merge attempting to recreate [MythoMax](https://huggingface.co/Gryphe/MythoMax-L2-13b) using the [SLERP](https://github.com/Undi95/LLM-SLERP-MergeTest) merging method and newer models.
## Current models
| exl2 Quant | Model Branch | Model Size | Minimum Recommended VRAM (4096 Context, fp16 cache) | BPW |
|-|-|-|-|-|
| 3-Bit | main | 5.44 GB | 8GB GPU | 3.14 |
| 3-Bit | 3bit | 6.36 GB | 10GB GPU | 3.72 |
| 4-Bit | 4bit | 7.13 GB | 12GB GPU (10GB with swap) | 4.2 |
| 4-Bit | 4.6bit | 7.81 GB | 12GB GPU | 4.63 |
| 5-Bit | [R136a1's Repo](https://huggingface.co/R136a1/ReMM-v2.2-L2-13B-exl2) | 8.96 GB | 16GB GPU (12GB with swap) | 5.33 |
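The VRAM column can be sanity-checked with a back-of-envelope calculation: quantized weight size plus the fp16 KV cache at 4096 context. A rough sketch in Python (assuming standard LLaMA-2 13B shapes, 40 layers and hidden size 5120; actual usage also includes activations and framework overhead):

```python
# Rough VRAM estimate for the 4.2 bpw quant at 4096 context.
# Assumes LLaMA-2 13B shapes: 40 layers, hidden size 5120, fp16 (2-byte) KV cache.
layers, hidden, ctx = 40, 5120, 4096
kv_cache_gb = 2 * layers * hidden * ctx * 2 / 1e9  # keys + values, 2 bytes each
weights_gb = 7.13  # 4-bit row in the table above
total_gb = weights_gb + kv_cache_gb
print(f"KV cache ~{kv_cache_gb:.2f} GB, total ~{total_gb:.2f} GB")
# prints: KV cache ~3.36 GB, total ~10.49 GB
```

~10.5 GB is consistent with the 12GB GPU recommendation, leaving some headroom for activations and display output.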
## Where to use
There are a couple of places you can run an exl2 model; here are a few:
- [tabbyAPI](https://github.com/theroyallab/tabbyAPI)
- [Aphrodite Engine](https://github.com/PygmalionAI/aphrodite-engine)
- [ExUI](https://github.com/turboderp/exui)
- [oobabooga's Text Gen Webui](https://github.com/oobabooga/text-generation-webui)
  - When using the built-in downloader, format the model name like this: Anthonyg5005/ReMM-v2.2-L2-13B-exl2**\:QuantBranch**
- With 5-Bit download: [R136a1/ReMM-v2.2-L2-13B-exl2](https://huggingface.co/R136a1/ReMM-v2.2-L2-13B-exl2)
- [KoboldAI](https://github.com/henk717/KoboldAI) (Clone repo, don't use snapshot)
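For reference, the `RepoName:QuantBranch` format above is just the repository id and branch joined by a colon. A minimal sketch of how a downloader might split it (hypothetical helper, not part of any of the tools listed):

```python
def split_model_spec(spec: str) -> tuple[str, str]:
    """Split 'user/repo:branch' into (repo_id, branch); default branch is 'main'."""
    repo, sep, branch = spec.partition(":")
    return repo, branch if sep else "main"

print(split_model_spec("Anthonyg5005/ReMM-v2.2-L2-13B-exl2:3bit"))
# prints: ('Anthonyg5005/ReMM-v2.2-L2-13B-exl2', '3bit')
```

Omitting the `:QuantBranch` suffix fetches the `main` branch, i.e. the 3.14 bpw quant in this repo.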
# How to download:
### oobabooga's downloader
Use a script like [download-model.py](https://github.com/oobabooga/text-generation-webui/blob/main/download-model.py) to download over HTTP with Python requests.\
Install the requirements:
```shell
pip install requests tqdm
```
Example for downloading 3bpw:
```shell
python download-model.py Anthonyg5005/ReMM-v2.2-L2-13B-exl2:3bit
```
### huggingface-cli
You may also use huggingface-cli.\
To install it, install the huggingface-hub Python package:
```shell
pip install huggingface-hub
```
Example for 3bpw:
```shell
huggingface-cli download Anthonyg5005/ReMM-v2.2-L2-13B-exl2 --local-dir ReMM-v2.2-L2-13B-exl2-3bpw --revision 3bit
```
### Git LFS (not recommended)
I recommend the HTTP downloaders above over git: they can resume failed downloads and are much easier to work with.\
Make sure you have git and Git LFS installed.\
Example 3bpw download with git:\
First, make sure LFS file skipping is disabled:
```shell
# windows
set GIT_LFS_SKIP_SMUDGE=0
# linux
export GIT_LFS_SKIP_SMUDGE=0
```
Then clone the desired branch of the repo:
```shell
git clone https://huggingface.co/Anthonyg5005/ReMM-v2.2-L2-13B-exl2 -b 3bit
```