File size: 1,212 Bytes
1d49485 1641df8 1d49485 1641df8 1d49485 |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 |
---
license: mit
license_link: https://huggingface.co/microsoft/phi-4/resolve/main/LICENSE
language:
- en
pipeline_tag: text-generation
tags:
- phi
- phi4
- nlp
- math
- code
- chat
- conversational
base_model: microsoft/phi-4
library_name: transformers
---
# Phi-4 GPTQ (4-bit Quantized)
[](https://huggingface.co/fhamborg/phi-4-4bit-gptq)
## Model Description
This is a **4-bit quantized** version of the Phi-4 transformer model, optimized for **efficient inference** while maintaining performance.
- **Base Model**: [Phi-4](https://huggingface.co/...)
- **Quantization**: autoround and bnb (4-bit)
- **Format**: `safetensors`
- **Tokenizer**: Uses standard `vocab.json` and `merges.txt`
## Intended Use
- Fast inference with minimal VRAM usage
- Deployment in resource-constrained environments
- Optimized for **low-latency text generation**
## Model Details
| Attribute | Value |
|-----------------|-------|
| **Model Name** | Phi-4 GPTQ |
| **Quantization** | 4-bit (GPTQ) |
| **File Format** | `.safetensors` |
| **Tokenizer** | `phi-4-tokenizer.json` |
| **VRAM Usage** | ~X GB (depending on batch size) |
|