File size: 1,747 Bytes

3a58e69
 
 
 
 
 
 
 
 
 
 
 
 
 
ea1bcbd
3a58e69
 
 
 
ea1bcbd
3a58e69
 
 
ea1bcbd
3a58e69
 
 
 
 
 
 
 
 
 
ea1bcbd
3a58e69
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
ea1bcbd
3a58e69
 
ea1bcbd
3a58e69
 
 
ea1bcbd
3a58e69
 
ea1bcbd

---
license: other
license_name: tongyi-qianwen
license_link: https://huggingface.co/Qwen/Qwen2-72B-Instruct/blob/main/LICENSE
pipeline_tag: text-generation
language:
- en
- zh
library_name: transformers
tags:
- mergekit
- llama
---

# IridiumLlama-72B-v0.1

## Model Description
FeatherLlama is a 72B parameter language model created through a merge of Qwen2-72B-Instruct, calme2.1-72b, and magnum-72b-v1 using `model_stock`.

This is converted from [leafspark/Iridium-72B-v0.1](https://huggingface.co/leafspark/Iridium-72B-v0.1)

## Features
- 72 billion parameters
- Sharded in 31 files (unlike Iridium, which has 963 shards due to the merging process)
- Combines Magnum prose with Calam smarts
- Llamaified for easy use

## Technical Specifications

### Architecture
- `LlamaForCasualLM`
- Models: Qwen2-72B-Instruct (base), calme2.1-72b, magnum-72b-v1
- Merged layers: 80
- Total tensors: 1,043
- Context length: 32k

### Tensor Distribution
- Attention layers: 560 files
- MLP layers: 240 files
- Layer norms: 160 files
- Miscellaneous (embeddings, output): 83 files

### Merging
Custom script utilizing safetensors library.

## Usage

### Loading the Model
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model = AutoModelForCausalLM.from_pretrained("leafspark/IridiumLlama-72B-v0.1", 
                                             device_map="auto", 
                                             torch_dtype=torch.float16)
tokenizer = AutoTokenizer.from_pretrained("leafspark/IridiumLlama-72B-v0.1")
```
### GGUFs

Find them here: [leafspark/IridiumLlama-72B-v0.1-GGUF](https://huggingface.co/leafspark/IridiumLlama-72B-v0.1-GGUF)

### Hardware Requirements
- At least ~150GB of free space
- ~150GB VRAM