Update README.md
README.md
CHANGED
@@ -20,6 +20,26 @@ tags:
 inference: false
 
 ---
+# Phi-3-vision-128k-instruct ONNX models for CPU and CUDA
+This repository hosts the optimized versions of [microsoft/Phi-3-vision-128k-instruct](https://huggingface.co/microsoft/Phi-3-vision-128k-instruct/) to accelerate inference with ONNX Runtime.
+This repository is a clone of [microsoft/Phi-3-vision-128k-instruct-onnx-cpu](https://huggingface.co/microsoft/Phi-3-vision-128k-instruct-onnx-cpu), with the extra files necessary for deploying the model behind OpenAI-API-compatible endpoints through the [`embeddedllm`](https://github.com/EmbeddedLLM/embeddedllm) PyPI library.
+
+## Usage on Windows (Intel / AMD / Nvidia / Qualcomm)
+```powershell
+conda create -n onnx python=3.10
+conda activate onnx
+winget install -e --id GitHub.GitLFS
+pip install huggingface-hub[cli]
+huggingface-cli download EmbeddedLLM/Phi-3-vision-128k-instruct-onnx --include='onnx/cpu_and_mobile/cpu-int4-rtn-block-32-acc-level-4' --local-dir .\Phi-3-vision-128k-instruct-onnx
+pip install numpy==1.26.4
+Invoke-WebRequest -Uri "https://raw.githubusercontent.com/microsoft/onnxruntime-genai/main/examples/python/phi3v.py" -OutFile "phi3v.py"
+pip install onnxruntime
+pip install --pre onnxruntime-genai==0.3.0rc2
+python phi3v.py -m .\Phi-3-vision-128k-instruct-onnx
+```
+
+# UPSTREAM README.md
+
 # Phi-3-vision-128k-instruct ONNX
 
 This repository hosts the optimized versions of [microsoft/Phi-3-vision-128k-instruct](https://huggingface.co/microsoft/Phi-3-vision-128k-instruct/) to accelerate inference with DirectML and ONNX Runtime.
@@ -78,14 +98,14 @@ pip install huggingface-hub[cli]
 
 4. **Download the model:**
    ```sh
-   huggingface-cli download EmbeddedLLM/Phi-3-vision-128k-instruct-onnx --include="onnx/
+   huggingface-cli download EmbeddedLLM/Phi-3-vision-128k-instruct-onnx --include="onnx/cpu_and_mobile/*" --local-dir .\Phi-3-vision-128k-instruct
    ```
 
 5. **Install necessary Python packages:**
    ```sh
    pip install numpy==1.26.4
-   pip install onnxruntime
-   pip install --pre onnxruntime-genai
+   pip install onnxruntime
+   pip install --pre onnxruntime-genai==0.3.0rc2
    ```
 
 6. **Install Visual Studio 2015 runtime:**
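
For readers who want to see what the downloaded `phi3v.py` actually does, the block below is a condensed sketch of its generation loop against the onnxruntime-genai 0.3.x multimodal API. The model folder, image path, and `max_length` value are illustrative assumptions; the script fetched by `Invoke-WebRequest` above remains the authoritative version.

```python
# Condensed sketch of the phi3v.py generation loop (onnxruntime-genai 0.3.x).
# The model directory and image path are placeholders; point the model path at
# the folder that holds the downloaded .onnx files and genai_config.json.
import onnxruntime_genai as og

model = og.Model(r".\Phi-3-vision-128k-instruct-onnx")
processor = model.create_multimodal_processor()
tokenizer_stream = processor.create_stream()

# Phi-3-vision chat template: <|image_1|> binds the attached image to the prompt.
image = og.Images.open(r".\example.jpg")
prompt = "<|user|>\n<|image_1|>\nDescribe this image.<|end|>\n<|assistant|>\n"
inputs = processor(prompt, images=image)

params = og.GeneratorParams(model)
params.set_inputs(inputs)
params.set_search_options(max_length=3072)

# Decode token by token, streaming text to stdout as it is generated.
generator = og.Generator(model, params)
while not generator.is_done():
    generator.compute_logits()
    generator.generate_next_token()
    print(tokenizer_stream.decode(generator.get_next_tokens()[0]), end="", flush=True)
```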
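
The repository description also mentions serving the model behind an OpenAI-API-compatible endpoint via `embeddedllm`. Assuming such an endpoint is already running locally, any stock OpenAI client can talk to it; the base URL, API key, and model name below are placeholder assumptions, not values documented in this README.

```python
# Hypothetical request to a locally served OpenAI-compatible endpoint.
# base_url, api_key, and model are assumptions; consult the embeddedllm
# project for the actual serving command and defaults.
import base64
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

# Vision prompts attach the image as a base64 data URL content part.
with open("example.jpg", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode()

response = client.chat.completions.create(
    model="phi-3-vision-128k-instruct-onnx",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "Describe this image."},
            {"type": "image_url",
             "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"}},
        ],
    }],
)
print(response.choices[0].message.content)
```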
|