jerryzh168 committed
Commit 9af0683 · verified · 1 Parent(s): d1aeb1f

Update README.md

Files changed (1): README.md +0 -6
README.md CHANGED
@@ -150,12 +150,6 @@ Our int4wo is only optimized for batch size 1, so we'll only benchmark the batch
  Note the result of latency (benchmark_latency) is in seconds, and serving (benchmark_serving) is in number of requests per second.
  Int4 weight only is optimized for batch size 1 and short input and output token length, please stay tuned for models optimized for larger batch sizes or longer token length.
 
- ## Download vllm source code and install vllm
- ```
- git clone git@github.com:vllm-project/vllm.git
- VLLM_USE_PRECOMPILED=1 pip install .
- ```
-
  ## Download dataset
  Download sharegpt dataset: `wget https://huggingface.co/datasets/anon8231489123/ShareGPT_Vicuna_unfiltered/resolve/main/ShareGPT_V3_unfiltered_cleaned_split.json`
 