Update README.md
README.md
CHANGED
@@ -1,6 +1,8 @@
 ---
 library_name: transformers
-tags:
+tags:
+- torchao
+license: mit
 ---
 
 [Phi4-mini](https://huggingface.co/microsoft/Phi-4-mini-instruct) model quantized with [torchao](https://huggingface.co/docs/transformers/main/en/quantization/torchao) int4 weight only quantization, by PyTorch team.
@@ -146,4 +148,4 @@ python benchmarks/benchmark_serving.py --backend vllm --dataset-name sharegpt --
 We can use the same command we used in serving benchmarks to serve the model with vllm
 ```
 vllm serve jerryzh168/phi4-mini-int4wo-hqq --tokenizer microsoft/Phi-4-mini-instruct -O3
-```
+```