Update README.md #4
by luow-amd · opened

README.md CHANGED
```diff
@@ -5,7 +5,7 @@ license: llama3.1
 ## Introduction
 This model was created by applying [Quark](https://quark.docs.amd.com/latest/index.html) with calibration samples from Pile dataset.
 ## Quantization Stragegy
-- ***Quantized Layers
+- ***Quantized Layers***: All linear layers excluding "lm_head"
 - ***Weight***: FP8 symmetric per-tensor
 - ***Activation***: FP8 symmetric per-tensor
 - ***KV Cache***: FP8 symmetric per-tensor
```
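For readers unfamiliar with the scheme named in the README, a minimal sketch of what "FP8 symmetric per-tensor" means: a single scale maps the tensor's largest magnitude onto the FP8 range, with zero-point fixed at 0. This is an illustrative assumption, not Quark's actual implementation; FP8-E4M3 is only simulated here via its maximum representable value (448), and rounding to the real E4M3 grid is omitted.

```python
import numpy as np

# Maximum finite value representable in FP8-E4M3 (used to simulate the format).
FP8_E4M3_MAX = 448.0

def quantize_per_tensor(w: np.ndarray) -> tuple[np.ndarray, float]:
    """Symmetric per-tensor quantization sketch: one scale for the whole
    tensor, chosen so the largest magnitude maps to the FP8 maximum.
    (Real FP8 would also round each value to the E4M3 grid.)"""
    scale = np.abs(w).max() / FP8_E4M3_MAX
    q = np.clip(w / scale, -FP8_E4M3_MAX, FP8_E4M3_MAX)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover an approximation of the original tensor."""
    return q * scale

# Hypothetical toy tensor standing in for a linear layer's weights.
w = np.array([-2.0, 0.5, 1.0])
q, s = quantize_per_tensor(w)
restored = dequantize(q, s)
```

Because the scale is shared across the whole tensor ("per-tensor") and symmetric around zero, a single float per tensor is stored alongside the FP8 values; the README applies the same scheme separately to weights, activations, and the KV cache.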