Update README.md #4
by luow-amd · opened

README.md CHANGED
```diff
@@ -5,7 +5,7 @@ license: llama3.1
 ## Introduction
 This model was created by applying [Quark](https://quark.docs.amd.com/latest/index.html) with calibration samples from Pile dataset.
 ## Quantization Stragegy
-- ***Quantized Layers
+- ***Quantized Layers***: All linear layers excluding "lm_head"
 - ***Weight***: FP8 symmetric per-tensor
 - ***Activation***: FP8 symmetric per-tensor
 - ***KV Cache***: FP8 symmetric per-tensor
```
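For readers unfamiliar with the scheme named in the README, a minimal sketch of what "FP8 symmetric per-tensor" means: a single scale maps the tensor's largest magnitude onto the FP8 range, with zero-point fixed at 0. This is an illustrative assumption, not Quark's actual implementation; FP8-E4M3 is only simulated here via its maximum representable value (448), and rounding to the real E4M3 grid is omitted.

```python
import numpy as np

# Maximum finite value representable in FP8-E4M3 (used to simulate the format).
FP8_E4M3_MAX = 448.0

def quantize_per_tensor(w: np.ndarray) -> tuple[np.ndarray, float]:
    """Symmetric per-tensor quantization sketch: one scale for the whole
    tensor, chosen so the largest magnitude maps to the FP8 maximum.
    (Real FP8 would also round each value to the E4M3 grid.)"""
    scale = np.abs(w).max() / FP8_E4M3_MAX
    q = np.clip(w / scale, -FP8_E4M3_MAX, FP8_E4M3_MAX)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover an approximation of the original tensor."""
    return q * scale

# Hypothetical toy tensor standing in for a linear layer's weights.
w = np.array([-2.0, 0.5, 1.0])
q, s = quantize_per_tensor(w)
restored = dequantize(q, s)
```

Because the scale is shared across the whole tensor ("per-tensor") and symmetric around zero, a single float per tensor is stored alongside the FP8 values; the README applies the same scheme separately to weights, activations, and the KV cache.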