jerryzh168 committed
Commit e3d011c · verified · 1 Parent(s): 1170503

Update README.md

Files changed (1):
  README.md +8 -5
README.md CHANGED
@@ -210,10 +210,6 @@ and decode tokens per second will be more important than time to first token.
Note the result of latency (benchmark_latency) is in seconds, and serving (benchmark_serving) is in number of requests per second.
Int4 weight only is optimized for batch size 1 and short input and output token length, please stay tuned for models optimized for larger batch sizes or longer token length.

- ## Download dataset
- Download sharegpt dataset: `wget https://huggingface.co/datasets/anon8231489123/ShareGPT_Vicuna_unfiltered/resolve/main/ShareGPT_V3_unfiltered_cleaned_split.json`
-
- Other datasets can be found in: https://github.com/vllm-project/vllm/tree/main/benchmarks
## benchmark_latency

Need to install vllm nightly to get some recent changes
@@ -242,8 +238,15 @@ python benchmarks/benchmark_latency.py --input-len 256 --output-len 256 --model

We also benchmarked the throughput in a serving environment.

+ Download sharegpt dataset: `wget https://huggingface.co/datasets/anon8231489123/ShareGPT_Vicuna_unfiltered/resolve/main/ShareGPT_V3_unfiltered_cleaned_split.json`
+ Other datasets can be found in: https://github.com/vllm-project/vllm/tree/main/benchmarks

- Run the following under `vllm` source code root folder:
+ Get vllm source code:
+ ```
+ git clone git@github.com:vllm-project/vllm.git
+ ```
+
+ Run the following under `vllm` root folder:

### baseline
Server:
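For context on the first hunk: the latency benchmark the README points to takes the flags shown in the second hunk header (`--input-len`, `--output-len`, `--model`). A minimal sketch, assuming vllm is installed and run from the source root; the model name is a placeholder, since the actual checkpoint is not shown in this diff:

```bash
# Run from the vllm source root. The --input-len/--output-len/--model flags
# come from the hunk header above; the model name is a placeholder only.
python benchmarks/benchmark_latency.py \
    --input-len 256 \
    --output-len 256 \
    --model meta-llama/Llama-2-7b-hf
```

Per the note at the top of the hunk, benchmark_latency reports seconds, so lower is better.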
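Putting the steps added by this commit together, end to end. A hedged sketch only: the `vllm serve` entrypoint and the benchmark_serving flags (`--dataset-path`, `--num-prompts`) vary across vllm versions, and the model name is again a placeholder rather than the checkpoint this README benchmarks.

```bash
# Steps added in this commit: clone vllm, fetch the ShareGPT dataset,
# then measure serving throughput (requests per second) from the repo root.
git clone git@github.com:vllm-project/vllm.git
cd vllm
wget https://huggingface.co/datasets/anon8231489123/ShareGPT_Vicuna_unfiltered/resolve/main/ShareGPT_V3_unfiltered_cleaned_split.json

# Baseline server (assumed entrypoint; the README's exact command follows "Server:").
vllm serve meta-llama/Llama-2-7b-hf &

# Client side: throughput against ShareGPT prompts (flags may differ by vllm version).
python benchmarks/benchmark_serving.py \
    --model meta-llama/Llama-2-7b-hf \
    --dataset-path ShareGPT_V3_unfiltered_cleaned_split.json \
    --num-prompts 1000
```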