---
license: mit
license_link: https://huggingface.co/microsoft/phi-4/resolve/main/LICENSE
language:
- en
pipeline_tag: text-generation
tags:
- phi
- phi4
- nlp
- math
- code
- chat
- conversational
base_model: microsoft/phi-4
library_name: transformers
---
|
# Phi-4 GPTQ (4-bit Quantized)
|
|
|
[Model on Hugging Face](https://huggingface.co/fhamborg/phi-4-4bit-gptq)
|
|
|
## Model Description

This is a **4-bit quantized** version of the Phi-4 transformer model, optimized for **efficient inference** while preserving most of the base model's quality.
|
|
|
- **Base Model**: [Phi-4](https://huggingface.co/...)
- **Quantization**: 4-bit (AutoRound and bitsandbytes)
- **Format**: `safetensors`
- **Tokenizer**: uses the standard `vocab.json` and `merges.txt`
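A minimal loading sketch, assuming the checkpoint loads through the standard `transformers` quantization integration; the repo id is taken from this card's Hugging Face link and the helper name is illustrative:

```python
def load_phi4_gptq(model_id: str = "fhamborg/phi-4-4bit-gptq"):
    # Imports are local so this sketch can be imported without the
    # libraries installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    # The 4-bit weights are dequantized on the fly during inference;
    # device_map="auto" places the layers on the available GPU(s).
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        device_map="auto",
        torch_dtype=torch.float16,
    )
    return tokenizer, model
```

Loading a quantized checkpoint this way requires no extra configuration: the quantization settings are read from the repo's `config.json`.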
|
|
|
## Intended Use

- Fast inference with minimal VRAM usage
- Deployment in resource-constrained environments
- Optimized for **low-latency text generation**
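For low-latency text generation, a single greedy `generate` call is the simplest path. The sketch below assumes a tokenizer/model pair already loaded with `transformers`; the helper name is hypothetical:

```python
def generate_reply(tokenizer, model, prompt: str, max_new_tokens: int = 128) -> str:
    # Phi-4 is a chat model; apply_chat_template formats the prompt
    # into the model's expected chat markup.
    messages = [{"role": "user", "content": prompt}]
    text = tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )
    inputs = tokenizer(text, return_tensors="pt").to(model.device)
    # Greedy decoding (do_sample=False) keeps latency low and output stable.
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)
    # Strip the prompt tokens so only the newly generated reply remains.
    new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```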
|
|
|
## Model Details
|
| Attribute        | Value                           |
|------------------|---------------------------------|
| **Model Name**   | Phi-4 GPTQ                      |
| **Quantization** | 4-bit (GPTQ)                    |
| **File Format**  | `.safetensors`                  |
| **Tokenizer**    | `phi-4-tokenizer.json`          |
| **VRAM Usage**   | ~X GB (depending on batch size) |
|
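Since the card lists bitsandbytes among the quantization paths, here is a minimal sketch of loading the base model in 4-bit via `BitsAndBytesConfig` (an assumption about how this quantization was configured, not a record of the exact settings used):

```python
def load_phi4_bnb(model_id: str = "microsoft/phi-4"):
    # Imports are local so this sketch can be imported without the
    # libraries installed.
    import torch
    from transformers import AutoModelForCausalLM, BitsAndBytesConfig

    # NF4 4-bit weights with fp16 compute: a common bitsandbytes setup.
    quant_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.float16,
    )
    return AutoModelForCausalLM.from_pretrained(
        model_id,
        quantization_config=quant_config,
        device_map="auto",
    )
```

Unlike the pre-quantized checkpoint, this path quantizes the full-precision weights at load time, so it downloads the full-size base model first.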