Update README.md
README.md
CHANGED
@@ -15,13 +15,13 @@ tags:
 - phi-4-mini
 ---
 
-##
+## Phi-4 Multimodal Instruct ONNX models
 
 ### Introduction
 
-ONNX version of
+This is an ONNX version of the Phi-4 multimodal model to accelerate inference with ONNX Runtime.
 
-This
+This model is quantized to int4 precision and runs on GPU devices.
 
 To run this model with ONNX Runtime:
 
@@ -40,13 +40,12 @@ curl https://raw.githubusercontent.com/microsoft/onnxruntime-genai/refs/heads/ma
 Run the script
 
 ```bash
-python phi4-mm.py -m Phi-4-multimodal-instruct-onnx/
+python phi4-mm.py -m Phi-4-multimodal-instruct-onnx/gpu/gpu-int4-rtn-block-32 -e cuda
 ```
 
-You will be prompted
+You will be prompted to provide any images, audios, and a prompt.
 
-
-The performance of the text component is similar the [phi4 mini ONNX models] (https://huggingface.co/microsoft/Phi-4-mini-instruct-onnx/blob/main/README.md)
+The performance of the text component is similar to the [Phi-4 mini ONNX models] (https://huggingface.co/microsoft/Phi-4-mini-instruct-onnx/blob/main/README.md)
 
 ### Model Description
 