fhamborg's picture
Update README.md
1641df8 verified
---
license: mit
license_link: https://huggingface.co/microsoft/phi-4/resolve/main/LICENSE
language:
- en
pipeline_tag: text-generation
tags:
- phi
- phi4
- nlp
- math
- code
- chat
- conversational
base_model: microsoft/phi-4
library_name: transformers
---
# Phi-4 GPTQ (4-bit Quantized)
[![Model](https://img.shields.io/badge/HuggingFace-Phi--4--GPTQ-orange)](https://huggingface.co/fhamborg/phi-4-4bit-gptq)
## Model Description
This is a **4-bit quantized** version of the Phi-4 transformer model, optimized for **efficient inference** while maintaining performance.
- **Base Model**: [Phi-4](https://huggingface.co/...)
- **Quantization**: autoround and bnb (4-bit)
- **Format**: `safetensors`
- **Tokenizer**: Uses standard `vocab.json` and `merges.txt`
## Intended Use
- Fast inference with minimal VRAM usage
- Deployment in resource-constrained environments
- Optimized for **low-latency text generation**
## Model Details
| Attribute | Value |
|-----------------|-------|
| **Model Name** | Phi-4 GPTQ |
| **Quantization** | 4-bit (GPTQ) |
| **File Format** | `.safetensors` |
| **Tokenizer** | `phi-4-tokenizer.json` |
| **VRAM Usage** | ~X GB (depending on batch size) |