Update README.md

Files changed (1) hide show

README.md CHANGED Viewed

@@ -80,7 +80,12 @@ print("Response:", output_text[0][len(prompt):])
 ```
 # Serving with vllm
-We can use the same command we used in serving benchmarks to serve the model with vllm
 ```
 vllm serve pytorch/Phi-4-mini-instruct-float8dq --tokenizer microsoft/Phi-4-mini-instruct -O3
 ```

 ```
 # Serving with vllm
+First install vllm nightly:
+```
+pip install vllm --pre --extra-index-url https://wheels.vllm.ai/nightly
+```
+Then we can serve with the following command:
 ```
 vllm serve pytorch/Phi-4-mini-instruct-float8dq --tokenizer microsoft/Phi-4-mini-instruct -O3
 ```