fhamborg committed 1d49485 (verified, parent 3da946c): Create README.md

Files changed (1): README.md (+42 −0)

---
license: mit
license_link: https://huggingface.co/microsoft/phi-4/resolve/main/LICENSE
language:
- en
pipeline_tag: text-generation
tags:
- phi
- phi4
- nlp
- math
- code
- chat
- conversational
base_model: microsoft/phi-4
library_name: transformers
---
# Phi-4 GPTQ (4-bit Quantized)

[![Model](https://img.shields.io/badge/HuggingFace-Phi--4--GPTQ-orange)](https://huggingface.co/fhamborg/phi-4-4bit-gptq)

## Model Description
This is a **4-bit GPTQ-quantized** version of the Phi-4 transformer model, optimized for **efficient inference** while largely preserving the output quality of the full-precision model.

- **Base Model**: [Phi-4](https://huggingface.co/microsoft/phi-4)
- **Quantization**: GPTQ (4-bit)
- **Format**: `safetensors`
- **Tokenizer**: Uses standard `vocab.json` and `merges.txt`

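A minimal loading sketch (not part of the original card): it assumes the repository id `fhamborg/phi-4-4bit-gptq` from the badge above, and a recent `transformers` install with a GPTQ backend (e.g. `optimum` + `auto-gptq`) available.

```python
# Loading sketch -- assumptions: repo id taken from the badge above;
# a GPTQ backend (e.g. optimum + auto-gptq) installed alongside transformers.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "fhamborg/phi-4-4bit-gptq"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",  # dispatch the 4-bit weights onto the available GPU(s)
)
```
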
## Intended Use
- Fast inference with minimal VRAM usage
- Deployment in resource-constrained environments
- Optimized for **low-latency text generation**

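As a rough illustration of the text-generation use case, the sketch below continues from the loading example above; whether `apply_chat_template` works as shown depends on the chat template shipped in this repo's tokenizer config.

```python
# Generation sketch -- continues from the loading example in "Model Description".
messages = [{"role": "user", "content": "Summarize GPTQ quantization in one sentence."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=64, do_sample=False)
# Decode only the newly generated tokens.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```
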
## Model Details
| Attribute | Value |
|-----------------|-------|
| **Model Name** | Phi-4 GPTQ |
| **Quantization** | 4-bit (GPTQ) |
| **File Format** | `.safetensors` |
| **Tokenizer** | `phi-4-tokenizer.json` |
| **VRAM Usage** | ~X GB (depending on batch size) |

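The VRAM figure above is left as a placeholder; one way to estimate it on your own hardware is the short check below (a sketch, assuming a single CUDA device and the model already loaded as in the example above).

```python
# Rough VRAM check after loading the model -- assumes a single CUDA device.
import torch

if torch.cuda.is_available():
    allocated_gb = torch.cuda.memory_allocated() / 1024**3
    peak_reserved_gb = torch.cuda.max_memory_reserved() / 1024**3
    print(f"allocated: {allocated_gb:.1f} GB, peak reserved: {peak_reserved_gb:.1f} GB")
```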