---
license: mit
license_link: https://huggingface.co/microsoft/phi-4/resolve/main/LICENSE
language:
- en
pipeline_tag: text-generation
tags:
- phi
- phi4
- nlp
- math
- code
- chat
- conversational
base_model: microsoft/phi-4
library_name: transformers
---

# Phi-4 GPTQ (4-bit Quantized)

[![Model](https://img.shields.io/badge/HuggingFace-Phi--4--GPTQ-orange)](https://huggingface.co/fhamborg/phi-4-4bit-gptq)

## Model Description

This is a **4-bit quantized** version of the Phi-4 transformer model, optimized for **efficient inference** while maintaining performance.

- **Base Model**: [Phi-4](https://huggingface.co/...)
- **Quantization**: GPTQ (4-bit)
- **Format**: `safetensors`
- **Tokenizer**: uses the standard `vocab.json` and `merges.txt`

## Intended Use

- Fast inference with minimal VRAM usage
- Deployment in resource-constrained environments
- Optimized for **low-latency text generation**

## Model Details

| Attribute        | Value |
|------------------|-------|
| **Model Name**   | Phi-4 GPTQ |
| **Quantization** | 4-bit (GPTQ) |
| **File Format**  | `.safetensors` |
| **Tokenizer**    | `phi-4-tokenizer.json` |
| **VRAM Usage**   | ~X GB (depending on batch size) |
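## Usage

A minimal loading and generation sketch, assuming the checkpoint works with the standard `transformers` API. The repo id is taken from the badge above; loading a GPTQ checkpoint additionally requires a GPTQ backend (e.g. `gptqmodel` or `auto-gptq`) installed alongside `transformers`, and the exact kwargs may need adjustment for your setup. The `approx_vram_gb` helper is a hypothetical back-of-the-envelope estimator, not part of the repo:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer


def approx_vram_gb(n_params_billion: float, bits: int) -> float:
    """Rough weight-only memory estimate in GB: params * bits / 8 bytes.

    Ignores activations, KV cache, and framework overhead, so treat it
    as a lower bound on real VRAM usage.
    """
    return n_params_billion * bits / 8


if __name__ == "__main__":
    model_id = "fhamborg/phi-4-4bit-gptq"  # repo id from the badge above

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        device_map="auto",  # place quantized weights on available GPU(s)
    )

    prompt = "Explain quantization in one sentence."
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=64)
    print(tokenizer.decode(out[0], skip_special_tokens=True))
```

With Phi-4's roughly 14B parameters, the helper gives about 7 GB for the 4-bit weights alone, which is one way to read the approximate VRAM figure in the table above.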