Update README.md
README.md
@@ -136,7 +136,7 @@ All inferences run on 2 RTX A6000 GPUs (using `vllm`, with a tensor-parallel siz

 | Models                          | Inference Time (sec) | Estimated Max Input Length (Char) |
 |---------------------------------|----------------------|-----------------------------------|
-| Yi-6B                           | 10.62                |                                   |
+| Yi-6B                           | 10.62                | 4.5k                              |
 | **Breeze-7B-Instruct-v0.1**     | 10.74                | 11.1k                             |
 | **Breeze-7B-Instruct-64k-v0.1** | 10.74                | 88.8k                             |
 | Qwen-7B                         | 10.86                | 9.8k                              |