---
license: mit
license_link: https://huggingface.co/microsoft/phi-4/resolve/main/LICENSE
language:
- en
pipeline_tag: text-generation
tags:
- phi
- phi4
- nlp
- math
- code
- chat
- conversational
base_model: microsoft/phi-4
library_name: transformers
---
# Phi-4 GPTQ (4-bit Quantized)

[![Model](https://img.shields.io/badge/HuggingFace-Phi--4--GPTQ-orange)](https://huggingface.co/fhamborg/phi-4-4bit-gptq)

## Model Description
This is a **4-bit quantized** version of the Phi-4 transformer model, optimized for **efficient inference** while preserving most of the base model's generation quality.

- **Base Model**: [Phi-4](https://huggingface.co/microsoft/phi-4)  
- **Quantization**: 4-bit (AutoRound and bitsandbytes/bnb)  
- **Format**: `safetensors`  
- **Tokenizer**: Uses standard `vocab.json` and `merges.txt`  
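
As a quick check that the checkpoint loads, the sketch below uses the standard `transformers` loading path. It assumes `transformers`, `accelerate`, and a matching quantization backend (e.g. `optimum`/`auto-gptq` or `bitsandbytes`, depending on how the checkpoint was exported) are installed; the repository id is taken from the badge above, so adjust it to your environment.

```python
# Minimal loading sketch (assumes transformers, accelerate, and a matching
# quantization backend are installed; repo id taken from the badge above).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "fhamborg/phi-4-4bit-gptq"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",   # place the 4-bit weights on the available GPU(s)
    torch_dtype="auto",  # keep the dtypes stored in the checkpoint
)
```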

## Intended Use
- Fast inference with minimal VRAM usage  
- Deployment in resource-constrained environments  
- Optimized for **low-latency text generation**  
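
For the low-latency chat use case above, here is a hedged generation sketch that continues the loading example (reusing `model` and `tokenizer`); it assumes the tokenizer ships with Phi-4's chat template.

```python
# Short greedy generation to illustrate low-latency, chat-style use.
# Continues the loading sketch above (reuses `model` and `tokenizer`).
messages = [{"role": "user", "content": "Summarize GPTQ quantization in one sentence."}]

input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=64, do_sample=False)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```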

## Model Details
| Attribute        | Value |
|-----------------|-------|
| **Model Name**  | Phi-4 GPTQ  |
| **Quantization** | 4-bit (GPTQ) |
| **File Format** | `.safetensors` |
| **Tokenizer** | `phi-4-tokenizer.json` |
| **VRAM Usage** | ~X GB (depending on batch size) |