---
license: mit
license_link: https://huggingface.co/microsoft/phi-4/resolve/main/LICENSE
language:
- en
pipeline_tag: text-generation
tags:
- phi
- phi4
- nlp
- math
- code
- chat
- conversational
base_model: microsoft/phi-4
library_name: transformers
---
|
# Phi-4 GPTQ (4-bit Quantized)
|
|
|
[Model on Hugging Face](https://huggingface.co/fhamborg/phi-4-4bit-gptq)
|
|
|
## Model Description

This is a **4-bit quantized** version of the Phi-4 transformer model, optimized for **efficient inference** while preserving most of the base model's quality.
|
|
|
- **Base Model**: [Phi-4](https://huggingface.co/...)
- **Quantization**: 4-bit (AutoRound and bitsandbytes)
- **Format**: `safetensors`
- **Tokenizer**: uses the standard `vocab.json` and `merges.txt`
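A minimal loading sketch, assuming the checkpoint loads through the standard `transformers` quantization integration; the repo id is taken from this card's Hugging Face link and the helper name is illustrative:

```python
def load_phi4_gptq(model_id: str = "fhamborg/phi-4-4bit-gptq"):
    # Imports are local so this sketch can be imported without the
    # libraries installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    # The 4-bit weights are dequantized on the fly during inference;
    # device_map="auto" places the layers on the available GPU(s).
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        device_map="auto",
        torch_dtype=torch.float16,
    )
    return tokenizer, model
```

Loading a quantized checkpoint this way requires no extra configuration: the quantization settings are read from the repo's `config.json`.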
|
|
|
## Intended Use

- Fast inference with minimal VRAM usage
- Deployment in resource-constrained environments
- Optimized for **low-latency text generation**
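For low-latency text generation, a single greedy `generate` call is the simplest path. The sketch below assumes a tokenizer/model pair already loaded with `transformers`; the helper name is hypothetical:

```python
def generate_reply(tokenizer, model, prompt: str, max_new_tokens: int = 128) -> str:
    # Phi-4 is a chat model; apply_chat_template formats the prompt
    # into the model's expected chat markup.
    messages = [{"role": "user", "content": prompt}]
    text = tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )
    inputs = tokenizer(text, return_tensors="pt").to(model.device)
    # Greedy decoding (do_sample=False) keeps latency low and output stable.
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)
    # Strip the prompt tokens so only the newly generated reply remains.
    new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```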
|
|
|
## Model Details
|
| Attribute        | Value                           |
|------------------|---------------------------------|
| **Model Name**   | Phi-4 GPTQ                      |
| **Quantization** | 4-bit (GPTQ)                    |
| **File Format**  | `.safetensors`                  |
| **Tokenizer**    | `phi-4-tokenizer.json`          |
| **VRAM Usage**   | ~X GB (depending on batch size) |
|
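Since the card lists bitsandbytes among the quantization paths, here is a minimal sketch of loading the base model in 4-bit via `BitsAndBytesConfig` (an assumption about how this quantization was configured, not a record of the exact settings used):

```python
def load_phi4_bnb(model_id: str = "microsoft/phi-4"):
    # Imports are local so this sketch can be imported without the
    # libraries installed.
    import torch
    from transformers import AutoModelForCausalLM, BitsAndBytesConfig

    # NF4 4-bit weights with fp16 compute: a common bitsandbytes setup.
    quant_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.float16,
    )
    return AutoModelForCausalLM.from_pretrained(
        model_id,
        quantization_config=quant_config,
        device_map="auto",
    )
```

Unlike the pre-quantized checkpoint, this path quantizes the full-precision weights at load time, so it downloads the full-size base model first.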