Update README.md
README.md
@@ -136,7 +136,7 @@ All inferences run on 2 RTX A6000 GPUs (using `vllm`, with a tensor-parallel siz

 | Models                          | Inference Time (sec) | Estimated Max Input Length (Char) |
 |---------------------------------|----------------------|-----------------------------------|
-| Yi-6B                           | 10.62                |                                   |
+| Yi-6B                           | 10.62                | 4.5k                              |
 | **Breeze-7B-Instruct-v0.1**     | 10.74                | 11.1k                             |
 | **Breeze-7B-Instruct-64k-v0.1** | 10.74                | 88.8k                             |
 | Qwen-7B                         | 10.86                | 9.8k                              |