seungahdev committed
Commit bed93de (1 parent: 91693e1)
Update README.md

README.md CHANGED
@@ -194,15 +194,14 @@ extra_gated_button_content: Submit
 </p>
 <!-- header end -->
 
-# Llama 3.1 8B Instruct -
+# Llama 3.1 8B Instruct - INT8
 
 - Model creator: [Meta Llama 3.1](https://huggingface.co/meta-llama)
 - Original model: [Llama 3.1 8B Instruct](https://huggingface.co/meta-llama/Meta-Llama-3.1-8B-Instruct)
 
 ## Description
 
-This repo contains the Llama 3 8B Instruct model quantized to
-Note that FP8 is only supported by NVIDIA Ada, Hopper, and Blackwell GPU architectures.
+This repo contains the Llama 3 8B Instruct model quantized to INT8 by FriendliAI, significantly enhancing its inference efficiency while maintaining high accuracy.
 Check out [FriendliAI documentation](https://docs.friendli.ai/) for more details.
 
 ## License
 
@@ -263,7 +262,7 @@ docker run \
 -e FRIENDLI_CONTAINER_SECRET="YOUR CONTAINER SECRET" \
 registry.friendli.ai/trial \
 --web-server-port 8000 \
---hf-model-name FriendliAI/Meta-Llama-3.1-8B-Instruct-
+--hf-model-name FriendliAI/Meta-Llama-3.1-8B-Instruct-int8 \
 --num-devices 1
 ```
 
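
Reviewer note: the new description says the model is "quantized to INT8" without saying what that involves. The sketch below illustrates one common scheme, symmetric per-tensor quantization, on a toy list of weights. It is an assumption for illustration only, not FriendliAI's actual method; the function names `quantize_int8` and `dequantize_int8` and the sample values are hypothetical.

```python
def quantize_int8(values):
    """Symmetric per-tensor quantization (illustrative, not FriendliAI's method).

    Maps floats to integers in [-127, 127] using a single scale factor.
    """
    max_abs = max((abs(v) for v in values), default=0.0)
    # One scale for the whole tensor; fall back to 1.0 for an all-zero input.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize_int8(quantized, scale):
    """Recover approximate float values from the INT8 codes."""
    return [q * scale for q in quantized]


# Hypothetical weights, chosen only to make the arithmetic easy to follow.
weights = [0.5, -1.27, 0.031, 1.0]
q, scale = quantize_int8(weights)
restored = dequantize_int8(q, scale)

# Rounding error per value is bounded by half the scale.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(q, scale, max_error)
```

Each value is stored in one byte plus a shared scale, which is where the memory and bandwidth savings come from. The round-trip error stays within half a quantization step, which is the usual intuition behind the claim that accuracy is largely preserved.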