chrisociepa committed
Commit: 188fab2
Parent(s): 964e1ec

Update README.md
README.md CHANGED
@@ -12,10 +12,10 @@ pipeline_tag: text-generation
 base_model: speakleash/Bielik-11B-v2.1-Instruct
 ---
 <p align="center">
-  <img src="https://huggingface.co/speakleash/Bielik-
+  <img src="https://huggingface.co/speakleash/Bielik-11B-v2/raw/main/speakleash_cyfronet.png">
 </p>
 
-# Bielik-11B-v2.
+# Bielik-11B-v2.1-Instruct-FP8
 
 This model was obtained by quantizing the weights and activations of [Bielik-11B-v2.1-Instruct](https://huggingface.co/speakleash/Bielik-11B-v2.1-Instruct) to FP8 data type, ready for inference with vLLM >= 0.5.0 or SGLang.
 AutoFP8 is used for quantization. This optimization reduces the number of bits per parameter from 16 to 8, reducing the disk size and GPU memory requirements by approximately 50%.

@@ -98,4 +98,4 @@ print(response)
 
 ## Contact Us
 
-If you have any questions or suggestions, please use the discussion tab. If you want to contact us directly, join our [Discord SpeakLeash](https://discord.gg/
+If you have any questions or suggestions, please use the discussion tab. If you want to contact us directly, join our [Discord SpeakLeash](https://discord.gg/pv4brQMDTy).
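The changed README text points at FP8 inference with vLLM >= 0.5.0, but the diff elides the usage section (lines 22–97). For orientation, here is a minimal, unofficial sketch of loading the quantized checkpoint with vLLM's offline API; the repo id `speakleash/Bielik-11B-v2.1-Instruct-FP8` and the sample prompt are assumptions, not taken from this commit.

```python
# Minimal sketch, not part of the commit: offline FP8 inference with vLLM.
# Assumed: repo id "speakleash/Bielik-11B-v2.1-Instruct-FP8" and the sample
# question below; requires vLLM >= 0.5.0 (FP8 support) per the README text.
from vllm import LLM, SamplingParams

llm = LLM(model="speakleash/Bielik-11B-v2.1-Instruct-FP8")  # FP8 checkpoint loads as-is
params = SamplingParams(temperature=0.7, max_tokens=256)

# Build a chat-formatted prompt with the tokenizer shipped in the checkpoint.
tokenizer = llm.get_tokenizer()
prompt = tokenizer.apply_chat_template(
    [{"role": "user", "content": "Who was Nicolaus Copernicus?"}],  # hypothetical prompt
    tokenize=False,
    add_generation_prompt=True,
)

outputs = llm.generate([prompt], params)
print(outputs[0].outputs[0].text)
```

The roughly 50% saving follows directly from the bit width: an 11B-parameter model holds about 11e9 × 2 bytes ≈ 22 GB of weights in BF16/FP16 versus ≈ 11 GB in FP8, before activations and KV cache.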