Vezora committed on
Commit
7761c1c
1 Parent(s): b64d635

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -8,7 +8,7 @@ This model is optimized for use with [VLLM](https://github.com/vllm-project/vllm
  ### Key Features of FP8 Marlin

- The Marlin kernel achieves impressive efficiency by packing 4 8-bit values into an int32 and performing a 4xFP8 to 4xFP16/BF16 dequantization using bit arithmetic and SIMT operations. This approach yields nearly a **2x speedup** over FP16 on most models while maintaining **near lossless quality**.
+ The NeuralMagic FP8 Marlin kernel achieves impressive efficiency by packing 4 8-bit values into an int32 and performing a 4xFP8 to 4xFP16/BF16 dequantization using bit arithmetic and SIMT operations. This approach yields nearly a **2x speedup** over FP16 on most models while maintaining **near lossless quality**.

  #### FP8 Advantages on NVIDIA GPUs
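The packing and bit-arithmetic dequantization described in the changed README line can be illustrated with a small CPU-side sketch. This is not the Marlin kernel itself: it assumes the FP8 format is E4M3 (the README does not say which variant is used), it handles one value at a time rather than using SIMT operations, and the function names `pack4`, `fp8_e4m3_to_fp16_bits`, and `dequant_word` are hypothetical. The idea shown is that an E4M3 byte can be moved into FP16 bit positions with shifts and masks, after which a single multiply by 2^8 corrects for the difference in exponent bias (15 versus 7).

```python
import struct


def pack4(b0, b1, b2, b3):
    """Pack four 8-bit values into one 32-bit word, b0 in the low byte."""
    return b0 | (b1 << 8) | (b2 << 16) | (b3 << 24)


def fp8_e4m3_to_fp16_bits(byte):
    """Move an E4M3 byte (assumed format) into FP16 bit positions.

    E4M3 layout: s eeee mmm        (bias 7)
    FP16 layout: s eeeee mmmmmmmmmm (bias 15)
    The sign goes to bit 15; the 7 exponent/mantissa bits go to bits 13..7,
    so the 4-bit exponent fills the low 4 bits of the 5-bit FP16 exponent.
    """
    sign = (byte & 0x80) << 8
    body = (byte & 0x7F) << 7
    return sign | body


def fp16_bits_to_float(bits):
    """Reinterpret a 16-bit pattern as an IEEE half-precision value."""
    return struct.unpack("<e", struct.pack("<H", bits))[0]


def dequant_word(word):
    """Unpack four E4M3 bytes from a 32-bit word and dequantize each.

    The bit move leaves the exponent biased by 7 instead of 15, so the
    result is scaled by 2**(15 - 7) = 256 to recover the FP8 value.
    Caveat: E4M3 NaN encodings (0x7F, 0xFF) are not preserved by this trick.
    """
    values = []
    for i in range(4):
        byte = (word >> (8 * i)) & 0xFF
        values.append(fp16_bits_to_float(fp8_e4m3_to_fp16_bits(byte)) * 256.0)
    return values


if __name__ == "__main__":
    word = pack4(0x38, 0xC0, 0x01, 0x7E)
    print(hex(word), dequant_word(word))
```

Because the exponent and mantissa fields are only shifted, not decoded, the same move also handles E4M3 subnormals correctly: they land on FP16 subnormals that differ from the true value by the same factor of 256.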