Update README.md
README.md
CHANGED
@@ -284,16 +284,16 @@ print(f"Peak Memory Usage: {mem:.02f} GB")
 Note the result of latency (benchmark_latency) is in seconds, and serving (benchmark_serving) is in number of requests per second.
 
 ## Setup
-Need to install vllm nightly to get some recent changes
-```Shell
-pip install vllm --pre --extra-index-url https://wheels.vllm.ai/nightly
-```
-
 Get vllm source code:
 ```Shell
 git clone git@github.com:vllm-project/vllm.git
 ```
 
+Install vllm
+```
+VLLM_USE_PRECOMPILED=1 pip install --editable .
+```
+
 Run the benchmarks under `vllm` root folder:
 
 ## benchmark_latency