amd/Meta-Llama-3.1-70B-Instruct-FP8-KV


luow-amd committed on Sep 9

Commit 1b032c7, 1 Parent(s): e3ba3bb

Update README.md (#4)

- Update README.md (9df306a899f1f6c614fcedeb7e28ec6e48b905e0)

Files changed (1)

README.md +1 -1

README.md CHANGED

@@ -5,7 +5,7 @@ license: llama3.1
 - ## Introduction
   This model was created by applying [Quark](https://quark.docs.amd.com/latest/index.html) with calibration samples from Pile dataset.
 - ## Quantization Stragegy
-  - ***Quantized Layers***：All linear layers excluding "lm_head"
+  - ***Quantized Layers***: All linear layers excluding "lm_head"
   - ***Weight***: FP8 symmetric per-tensor
   - ***Activation***: FP8 symmetric per-tensor
   - ***KV Cache***: FP8 symmetric  per-tensor
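
For readers unfamiliar with the "FP8 symmetric per-tensor" scheme named in the README, here is a minimal pure-Python sketch of the general idea: a single scale is derived from the largest absolute value in the tensor, values are divided by that scale, rounded onto the FP8 grid, and multiplied back. This is not Quark's implementation. The README only says "FP8", so the E4M3 variant (maximum finite value 448), the max-abs scale rule, and the helper names `round_to_e4m3`, `per_tensor_scale`, and `quantize_dequantize` are all assumptions made for illustration.

```python
import math

# Assumption: the FP8 variant is E4M3 (4 exponent bits, 3 mantissa bits,
# largest finite value 448). The README itself only says "FP8".
FP8_E4M3_MAX = 448.0


def round_to_e4m3(x: float) -> float:
    """Round x to the nearest E4M3-representable value, saturating at +/-448."""
    if x == 0.0:
        return 0.0
    a = min(abs(x), FP8_E4M3_MAX)
    _, e = math.frexp(a)          # a = m * 2**e with 0.5 <= m < 1
    # 3 mantissa bits give a grid spacing of 2**(e - 4) for normal numbers;
    # below 2**-6 the format is subnormal with a fixed spacing of 2**-9.
    step = 2.0 ** max(e - 4, -9)
    # Python's round() breaks ties to even, matching round-to-nearest-even.
    return math.copysign(round(a / step) * step, x)


def per_tensor_scale(values):
    """Symmetric per-tensor scale: one scale for the whole tensor, no zero point.

    Assumption: the scale maps the largest absolute value onto the FP8 maximum.
    """
    amax = max(abs(v) for v in values)
    return amax / FP8_E4M3_MAX if amax > 0 else 1.0


def quantize_dequantize(values):
    """Simulate the FP8 round trip and return (dequantized values, scale)."""
    scale = per_tensor_scale(values)
    quantized = [round_to_e4m3(v / scale) for v in values]
    return [q * scale for q in quantized], scale


if __name__ == "__main__":
    weights = [0.5, -1.0, 0.25, 2.0, 0.013]
    restored, scale = quantize_dequantize(weights)
    for w, r in zip(weights, restored):
        print(f"{w:+.6f} -> {r:+.6f}")
```

"Symmetric" here means the grid is centered on zero, so no zero-point offset is stored, and "per-tensor" means the whole tensor shares one scale rather than one scale per channel or per group.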