|
--- |
|
license: other |
|
license_name: tongyi-qianwen |
|
license_link: https://huggingface.co/Qwen/Qwen2-72B-Instruct/blob/main/LICENSE |
|
pipeline_tag: text-generation |
|
language: |
|
- en |
|
- zh |
|
library_name: transformers |
|
tags: |
|
- mergekit |
|
- llama |
|
--- |
|
|
|
# IridiumLlama-72B-v0.1 |
|
|
|
## Model Description |
|
IridiumLlama is a 72B parameter language model created through a merge of Qwen2-72B-Instruct, calme2.1-72b, and magnum-72b-v1 using `model_stock`. |
|
|
|
This is converted from [leafspark/Iridium-72B-v0.1](https://huggingface.co/leafspark/Iridium-72B-v0.1) (currently private) |
|
|
|
## Features |
|
- 72 billion parameters |
|
- Sharded in 31 files (unlike Iridium, which has 963 shards due to the merging process) |
|
- Combines Magnum prose with Calam smarts |
|
- Llamaified for easy use |
|
|
|
## Technical Specifications |
|
|
|
### Architecture |
|
- `LlamaForCasualLM` |
|
- Models: Qwen2-72B-Instruct (base), calme2.1-72b, magnum-72b-v1 |
|
- Merged layers: 80 |
|
- Total tensors: 1,043 |
|
- Context length: 32k |
|
|
|
### Tensor Distribution |
|
- Attention layers: 560 files |
|
- MLP layers: 240 files |
|
- Layer norms: 160 files |
|
- Miscellaneous (embeddings, output): 162 files |
|
|
|
### Merging |
|
Custom script utilizing safetensors library. |
|
|
|
## Usage |
|
|
|
### Loading the Model |
|
```python |
|
from transformers import AutoModelForCausalLM, AutoTokenizer |
|
import torch |
|
|
|
model = AutoModelForCausalLM.from_pretrained("leafspark/IridiumLlama-72B-v0.1", |
|
device_map="auto", |
|
torch_dtype=torch.float16) |
|
tokenizer = AutoTokenizer.from_pretrained("leafspark/IridiumLlama-72B-v0.1") |
|
``` |
|
### GGUFs |
|
|
|
Find them here: [mradermacher/IridiumLlama-72B-v0.1-GGUF](https://huggingface.co/mradermacher/IridiumLlama-72B-v0.1-GGUF) |
|
|
|
### Hardware Requirements |
|
- At least ~150GB of free space |
|
- ~150GB VRAM |