# Commit 8766911 by baptistecolle: Add torchao int4 weight only quantization as an option (#34)
- machine: 1xA10
  description: A10-24GB-150W 🖥️
  hardware_provider: nvidia
  hardware_type: cuda
  subsets:
    - unquantized
    - awq
    - bnb
    - gptq
  backends:
    - pytorch
- machine: 1xA100
  description: A100-80GB-275W 🖥️
  hardware_provider: nvidia
  hardware_type: cuda
  subsets:
    - unquantized
    - awq
    - bnb
    - gptq
    - torchao
  backends:
    - pytorch
- machine: 1xT4
  description: T4-16GB-70W 🖥️
  hardware_provider: nvidia
  hardware_type: cuda
  subsets:
    - unquantized
    - awq
    - bnb
    - gptq
    - torchao
  backends:
    - pytorch
- machine: 32vCPU-C7i
  description: Intel-Xeon-SPR-385W 🖥️
  detail: |
    We tested the [32vCPU AWS C7i](https://aws.amazon.com/ec2/instance-types/c7i/) instance for the benchmark.
  hardware_provider: intel
  hardware_type: cpu
  subsets:
    - unquantized
  backends:
    - pytorch
    - openvino
    - onnxruntime