RangiLyu committed
Commit edeb08e · verified · 1 Parent(s): a277370

Update README.md

Files changed (1): README.md (+14 -10)
README.md CHANGED
@@ -191,30 +191,34 @@ print(decoded_output)
 
 ### Serving
 
-You can utilize one of the following LLM inference frameworks to create an OpenAI compatible server:
+The minimum hardware requirements for deploying Intern-S1 series models are:
+
+| Model | A100(GPUs) | H800(GPUs) | H100(GPUs) | H200(GPUs) |
+| :---------------------------------------------------------------------: | :--------: | :--------: | :--------: | :--------: |
+| [internlm/Intern-S1](https://huggingface.co/internlm/Intern-S1) | 8 | 8 | 8 | 4 |
+| [internlm/Intern-S1-FP8](https://huggingface.co/internlm/Intern-S1-FP8) | - | 4 | 4 | 2 |
+
+You can utilize one of the following LLM inference frameworks to create an OpenAI compatible server:
 
 #### [lmdeploy(>=0.9.2)](https://github.com/InternLM/lmdeploy)
 
-```
+```bash
 lmdeploy serve api_server internlm/Intern-S1 --reasoning-parser intern-s1 --tool-call-parser intern-s1 --tp 8
 ```
 
 #### [vllm](https://github.com/vllm-project/vllm)
 
-Coming soon.
+```bash
+vllm serve internlm/Intern-S1 --tensor-parallel-size 8 --trust-remote-code
+```
 
 #### [sglang](https://github.com/sgl-project/sglang)
 
-Supporting Intern-S1 with SGLang is still in progress. Please refer to this [PR](https://github.com/sgl-project/sglang/pull/8350).
-
 ```bash
-CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 \
-python3 -m sglang.launch_server \
+python3 -m sglang.launch_server \
   --model-path internlm/Intern-S1 \
   --trust-remote-code \
-  --mem-fraction-static 0.85 \
   --tp 8 \
-  --enable-multimodal \
   --grammar-backend none
 ```
 
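Each of the launch commands in this hunk exposes an OpenAI-compatible endpoint. A minimal client sketch, assuming the `openai` Python package and vllm's default port 8000 (lmdeploy and sglang listen on different default ports, so adjust `base_url` to your launch flags):

```python
# Minimal sketch of a chat call against an OpenAI-compatible server
# started with one of the commands above. Port 8000 is vllm's default;
# adjust base_url for lmdeploy or sglang deployments.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="internlm/Intern-S1",
    messages=[{"role": "user", "content": "Briefly introduce yourself."}],
)
print(response.choices[0].message.content)
```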
 
@@ -225,7 +229,7 @@ CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 \
 curl -fsSL https://ollama.com/install.sh | sh
 # fetch model
 ollama pull internlm/interns1
-# run model
+# run model
 ollama run internlm/interns1
 # then use openai client to call on http://localhost:11434/v1
 ```
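The closing comment in this hunk points clients at http://localhost:11434/v1. A minimal sketch of that call with the `openai` Python package (the client requires an api_key argument, but ollama does not validate it):

```python
# Call the ollama-served model through its OpenAI-compatible endpoint,
# as the "# then use openai client" comment suggests. The api_key is a
# required placeholder that ollama does not check.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")

response = client.chat.completions.create(
    model="internlm/interns1",
    messages=[{"role": "user", "content": "What is the boiling point of water?"}],
)
print(response.choices[0].message.content)
```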
 