### Introduction

This is an ONNX version of the Phi-4 multimodal model that is quantized to int4 precision to accelerate inference with ONNX Runtime.

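Int4 quantization stores each weight in 4 bits instead of 16, so the weight footprint shrinks by roughly 4x. A back-of-envelope sketch (the ~5.6B parameter count below is an approximation used only for illustration, not an official figure):

```shell
# Rough weight-storage comparison; the parameter count is an
# illustrative approximation, not an official figure.
params=5600000000
fp16_gb=$((params * 2 / 1000000000))  # fp16: 2 bytes per weight
int4_gb=$((params / 2 / 1000000000))  # int4: 0.5 bytes per weight
echo "fp16: ~${fp16_gb} GB of weights, int4: ~${int4_gb} GB"
```

The smaller footprint is what lets the int4 model fit in less memory and stream weights faster during inference.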
## Model Run

You can see how to run examples with ORT GenAI [here](https://github.com/microsoft/onnxruntime-genai/blob/main/examples/python/phi-4-multi-modal.md).

For CPU: stay tuned!

<!-- ```bash
# Download the model directly using the Hugging Face CLI
huggingface-cli download microsoft/Phi-4-multimodal-instruct-onnx --include cpu_and_mobile/cpu-int4-rtn-block-32-acc-level-4/* --local-dir .

# Install the CPU package of ONNX Runtime GenAI
pip install --pre onnxruntime-genai

# Please adjust the model directory (-m) accordingly
curl https://raw.githubusercontent.com/microsoft/onnxruntime-genai/main/examples/python/phi4-mm.py -o phi4-mm.py
python phi4-mm.py -m cpu_and_mobile/cpu-int4-rtn-block-32-acc-level-4 -e cpu
``` -->

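If you are unsure whether the CUDA flow below applies to your machine, one quick heuristic is to look for the NVIDIA driver tooling on the PATH; a minimal sketch (it assumes `nvidia-smi` is present only when an NVIDIA driver is installed, which is the common case):

```shell
# Hypothetical helper: suggest an ONNX Runtime GenAI package based on
# whether the NVIDIA driver tooling (nvidia-smi) is on the PATH.
if command -v nvidia-smi >/dev/null 2>&1; then
  pkg="onnxruntime-genai-cuda"
else
  pkg="onnxruntime-genai"
fi
echo "suggested package: $pkg"
```

This is only a heuristic; the package you install must still match the execution provider (`-e`) you pass to the example script.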
For CUDA:

```bash
# Download the model directly using the Hugging Face CLI
huggingface-cli download microsoft/Phi-4-multimodal-instruct-onnx --include gpu/* --local-dir .

# Install the CUDA package of ONNX Runtime GenAI
pip install --pre onnxruntime-genai-cuda

# Please adjust the model directory (-m) accordingly
curl https://raw.githubusercontent.com/microsoft/onnxruntime-genai/main/examples/python/phi4-mm.py -o phi4-mm.py
python phi4-mm.py -m gpu/gpu-int4-rtn-block-32 -e cuda
```

For DirectML:

```bash
# Download the model directly using the Hugging Face CLI
huggingface-cli download microsoft/Phi-4-multimodal-instruct-onnx --include gpu/* --local-dir .

# Install the DML package of ONNX Runtime GenAI
pip install --pre onnxruntime-genai-directml

# Please adjust the model directory (-m) accordingly
curl https://raw.githubusercontent.com/microsoft/onnxruntime-genai/main/examples/python/phi4-mm.py -o phi4-mm.py
python phi4-mm.py -m gpu/gpu-int4-rtn-block-32 -e dml
```
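After the download step, you can sanity-check that the model directory is complete before launching the script. A minimal sketch (`check_model_dir` is a hypothetical helper; it assumes the directory layout from the commands above and that the model ships a `genai_config.json`, the configuration file ONNX Runtime GenAI models include):

```shell
# Hypothetical helper: confirm a downloaded ORT GenAI model directory
# contains the genai_config.json configuration file before running.
check_model_dir() {
  if [ -f "$1/genai_config.json" ]; then
    echo "ok"
  else
    echo "missing genai_config.json in $1"
  fi
}

# e.g. the CUDA/DirectML model directory used by the commands above
check_model_dir "gpu/gpu-int4-rtn-block-32"
```

If the file is missing, re-run the `huggingface-cli download` command and verify the `--include` pattern and `--local-dir` match the `-m` path you pass to the script.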

You will be prompted to provide images, audio files, and a prompt.