jerryzh168 commited on
Commit
451b943
·
verified ·
1 Parent(s): e60e06a

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +6 -1
README.md CHANGED
@@ -80,7 +80,12 @@ print("Response:", output_text[0][len(prompt):])
80
  ```
81
 
82
  # Serving with vllm
83
- We can use the same command we used in serving benchmarks to serve the model with vllm
 
 
 
 
 
84
  ```
85
  vllm serve pytorch/Phi-4-mini-instruct-float8dq --tokenizer microsoft/Phi-4-mini-instruct -O3
86
  ```
 
80
  ```
81
 
82
  # Serving with vllm
83
+ First install vllm nightly:
84
+ ```
85
+ pip install vllm --pre --extra-index-url https://wheels.vllm.ai/nightly
86
+ ```
87
+
88
+ Then we can serve with the following command:
89
  ```
90
  vllm serve pytorch/Phi-4-mini-instruct-float8dq --tokenizer microsoft/Phi-4-mini-instruct -O3
91
  ```