Text Generation
Transformers
Safetensors
llama
text-generation-inference
Inference Endpoints
yi-01-ai committed on
Commit
6868e97
1 Parent(s): 22e89cf

Auto Sync from git://github.com/01-ai/Yi.git/commit/81dbb7886c95ed0754a6b62805e8fa7bc7db60d8

Files changed (1)
  1. README.md +18 -16
README.md CHANGED
@@ -152,20 +152,20 @@ pipeline_tag: text-generation
  ## News
 
  <details open>
- <summary>🎯 <b>2024/03/06</b>: The Yi-9B is open-sourced and available to the public.</summary>
+ <summary>🎯 <b>2024-03-06</b>: The <code>Yi-9B</code> is open-sourced and available to the public.</summary>
  <br>
- Yi-9B stands out as the top performer among a range of similar-sized open-source models (including Mistral-7B, SOLAR-10.7B, Gemma-7B, DeepSeek-Coder-7B-Base-v1.5 and more), particularly excelling in code, math, common-sense reasoning, and reading comprehension.
+ <code>Yi-9B</code> stands out as the top performer among a range of similar-sized open-source models (including Mistral-7B, SOLAR-10.7B, Gemma-7B, DeepSeek-Coder-7B-Base-v1.5 and more), particularly excelling in code, math, common-sense reasoning, and reading comprehension.
  </details>
 
  <details open>
- <summary>🎯 <b>2024/01/23</b>: The Yi-VL models, <code><a href="https://huggingface.co/01-ai/Yi-VL-34B">Yi-VL-34B</a></code> and <code><a href="https://huggingface.co/01-ai/Yi-VL-6B">Yi-VL-6B</a></code>, are open-sourced and available to the public.</summary>
+ <summary>🎯 <b>2024-01-23</b>: The Yi-VL models, <code><a href="https://huggingface.co/01-ai/Yi-VL-34B">Yi-VL-34B</a></code> and <code><a href="https://huggingface.co/01-ai/Yi-VL-6B">Yi-VL-6B</a></code>, are open-sourced and available to the public.</summary>
  <br>
  <code><a href="https://huggingface.co/01-ai/Yi-VL-34B">Yi-VL-34B</a></code> has ranked <strong>first</strong> among all existing open-source models in the latest benchmarks, including <a href="https://arxiv.org/abs/2311.16502">MMMU</a> and <a href="https://arxiv.org/abs/2401.11944">CMMMU</a> (based on data available up to January 2024).
  </details>
 
 
  <details>
- <summary>🎯 <b>2023/11/23</b>: <a href="#chat-models">Chat models</a> are open-sourced and available to the public.</summary>
+ <summary>🎯 <b>2023-11-23</b>: <a href="#chat-models">Chat models</a> are open-sourced and available to the public.</summary>
  <br>This release contains two chat models based on previously released base models, two 8-bit models quantized by GPTQ, and two 4-bit models quantized by AWQ.
 
  - `Yi-34B-Chat`
@@ -182,11 +182,11 @@ You can try some of them interactively at:
  </details>
 
  <details>
- <summary>🔔 <b>2023/11/23</b>: The Yi Series Models Community License Agreement is updated to <a href="https://github.com/01-ai/Yi/blob/main/MODEL_LICENSE_AGREEMENT.txt">v2.1</a>.</summary>
+ <summary>🔔 <b>2023-11-23</b>: The Yi Series Models Community License Agreement is updated to <a href="https://github.com/01-ai/Yi/blob/main/MODEL_LICENSE_AGREEMENT.txt">v2.1</a>.</summary>
  </details>
 
  <details>
- <summary>🔥 <b>2023/11/08</b>: Invited test of Yi-34B chat model.</summary>
+ <summary>🔥 <b>2023-11-08</b>: Invited test of Yi-34B chat model.</summary>
  <br>Application form:
 
  - [English](https://cn.mikecrm.com/l91ODJf)
@@ -194,13 +194,13 @@ You can try some of them interactively at:
  </details>
 
  <details>
- <summary>🎯 <b>2023/11/05</b>: <a href="#base-models">The base models, </a><code>Yi-6B-200K</code> and <code>Yi-34B-200K</code>, are open-sourced and available to the public.</summary>
+ <summary>🎯 <b>2023-11-05</b>: <a href="#base-models">The base models, </a><code>Yi-6B-200K</code> and <code>Yi-34B-200K</code>, are open-sourced and available to the public.</summary>
  <br>This release contains two base models with the same parameter sizes as the previous
  release, except that the context window is extended to 200K.
  </details>
 
  <details>
- <summary>🎯 <b>2023/11/02</b>: <a href="#base-models">The base models, </a><code>Yi-6B</code> and <code>Yi-34B</code>, are open-sourced and available to the public.</summary>
+ <summary>🎯 <b>2023-11-02</b>: <a href="#base-models">The base models, </a><code>Yi-6B</code> and <code>Yi-34B</code>, are open-sourced and available to the public.</summary>
  <br>The first public release contains two bilingual (English/Chinese) base models
  with the parameter sizes of 6B and 34B. Both of them are trained with 4K
  sequence length and can be extended to 32K during inference time.
@@ -939,11 +939,11 @@ Before deploying Yi in your environment, make sure your hardware meets the follo
 
  | Model | Minimum VRAM | Recommended GPU Example |
  |----------------------|--------------|:-------------------------------------:|
- | Yi-6B-Chat | 15 GB | RTX 3090 <br> RTX 4090 <br> A10 <br> A30 |
+ | Yi-6B-Chat | 15 GB | 1 x RTX 3090 <br> 1 x RTX 4090 <br> A10 <br> A30 |
- | Yi-6B-Chat-4bits | 4 GB | RTX 3060 <br> RTX 4060 |
+ | Yi-6B-Chat-4bits | 4 GB | 1 x RTX 3060 <br> 1 x RTX 4060 |
- | Yi-6B-Chat-8bits | 8 GB | RTX 3070 <br> RTX 4060 |
+ | Yi-6B-Chat-8bits | 8 GB | 1 x RTX 3070 <br> 1 x RTX 4060 |
  | Yi-34B-Chat | 72 GB | 4 x RTX 4090 <br> A800 (80GB) |
- | Yi-34B-Chat-4bits | 20 GB | RTX 3090 <br> RTX 4090 <br> A10 <br> A30 <br> A100 (40GB) |
+ | Yi-34B-Chat-4bits | 20 GB | 1 x RTX 3090 <br> 1 x RTX 4090 <br> A10 <br> A30 <br> A100 (40GB) |
  | Yi-34B-Chat-8bits | 38 GB | 2 x RTX 3090 <br> 2 x RTX 4090 <br> A800 (40GB) |
 
  Below are detailed minimum VRAM requirements under different batch use cases.
@@ -961,7 +961,7 @@ Below are detailed minimum VRAM requirements under different batch use cases.
 
  | Model | Minimum VRAM | Recommended GPU Example |
  |----------------------|--------------|:-------------------------------------:|
- | Yi-6B | 15 GB | RTX3090 <br> RTX4090 <br> A10 <br> A30 |
+ | Yi-6B | 15 GB | 1 x RTX 3090 <br> 1 x RTX 4090 <br> A10 <br> A30 |
  | Yi-6B-200K | 50 GB | A800 (80 GB) |
  | Yi-9B | 20 GB | 1 x RTX 4090 (24 GB) |
  | Yi-34B | 72 GB | 4 x RTX 4090 <br> A800 (80 GB) |
@@ -1024,6 +1024,8 @@ With all these resources at your fingertips, you're ready to start your exciting
  - [Benchmarks](#benchmarks)
  - [Chat model performance](#chat-model-performance)
  - [Base model performance](#base-model-performance)
+ - [Yi-34B and Yi-34B-200K](#yi-34b-and-yi-34b-200k)
+ - [Yi-9B](#yi-9b)
 
  ## Ecosystem
 
@@ -1032,8 +1034,8 @@ Yi has a comprehensive ecosystem, offering a range of tools, services, and model
  - [Upstream](#upstream)
  - [Downstream](#downstream)
  - [Serving](#serving)
- - [Quantitation](#️quantitation)
+ - [Quantization](#quantization-1)
- - [Fine-tuning](#️fine-tuning)
+ - [Fine-tuning](#fine-tuning-1)
  - [API](#api)
 
  ### Upstream
@@ -1158,7 +1160,7 @@ Yi-9B is almost the best among a range of similar-sized open-source models (incl
 
  ![Yi-9B benchmark - details](https://github.com/01-ai/Yi/blob/main/assets/img/Yi-9B_benchmark_details.png?raw=true)
 
- - In terms of **overall** ability (`Mean-All), Yi-9B performs the best among similarly sized open-source models, surpassing DeepSeek-Coder, DeepSeek-Math, Mistral-7B, SOLAR-10.7B, and Gemma-7B.
+ - In terms of **overall** ability (Mean-All), Yi-9B performs the best among similarly sized open-source models, surpassing DeepSeek-Coder, DeepSeek-Math, Mistral-7B, SOLAR-10.7B, and Gemma-7B.
 
  ![Yi-9B benchmark - overall](https://github.com/01-ai/Yi/blob/main/assets/img/Yi-9B_benchmark_overall.png?raw=true)
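The VRAM minimums in the tables above track a simple rule of thumb: weight memory is roughly parameter count times bytes per parameter (16-bit, 8-bit GPTQ, or 4-bit AWQ), with the margin above that going to activations and the KV cache. A minimal sketch of that arithmetic, for rough sanity-checking only (the helper name and the headroom interpretation are illustrative, not from the README):

```python
def approx_weight_gb(n_params_billions: float, bits: int) -> float:
    """Approximate weight-only VRAM in GiB: params * (bits / 8) bytes each."""
    return n_params_billions * 1e9 * (bits / 8) / 1024**3

# FP16 Yi-6B: ~11.2 GiB of weights alone; the table's 15 GB minimum adds
# headroom for activations and the KV cache. The same pattern holds for the
# quantized and 34B variants.
print(round(approx_weight_gb(6, 16), 1))   # 11.2 -> Yi-6B-Chat, table: 15 GB
print(round(approx_weight_gb(6, 4), 1))    # 2.8  -> Yi-6B-Chat-4bits, table: 4 GB
print(round(approx_weight_gb(34, 16), 1))  # 63.3 -> Yi-34B-Chat, table: 72 GB
```

Note this estimates weights only; actual requirements grow with batch size and sequence length, which is why the README lists per-batch minimums separately.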
1166