seungahdev committed on
Commit
bed93de
1 Parent(s): 91693e1

Update README.md

Files changed (1)
  1. README.md +3 -4
README.md CHANGED
@@ -194,15 +194,14 @@ extra_gated_button_content: Submit
  </p>
  <!-- header end -->
 
- # Llama 3.1 8B Instruct - FP8
+ # Llama 3.1 8B Instruct - INT8
 
  - Model creator: [Meta Llama 3.1](https://huggingface.co/meta-llama)
  - Original model: [Llama 3.1 8B Instruct](https://huggingface.co/meta-llama/Meta-Llama-3.1-8B-Instruct)
 
  ## Description
 
- This repo contains the Llama 3 8B Instruct model quantized to FP8 by FriendliAI, significantly enhancing its inference efficiency while maintaining high accuracy.
- Note that FP8 is only supported by NVIDIA Ada, Hopper, and Blackwell GPU architectures.
+ This repo contains the Llama 3 8B Instruct model quantized to INT8 by FriendliAI, significantly enhancing its inference efficiency while maintaining high accuracy.
  Check out [FriendliAI documentation](https://docs.friendli.ai/) for more details.
 
  ## License
@@ -263,7 +262,7 @@ docker run \
  -e FRIENDLI_CONTAINER_SECRET="YOUR CONTAINER SECRET" \
  registry.friendli.ai/trial \
  --web-server-port 8000 \
- --hf-model-name FriendliAI/Meta-Llama-3.1-8B-Instruct-fp8 \
+ --hf-model-name FriendliAI/Meta-Llama-3.1-8B-Instruct-int8 \
  --num-devices 1
  ```
 