llm-perf-leaderboard / hardware.yaml
Add torchao int4 weight only quantization as an option (#34)
- machine: 1xA10
  description: A10-24GB-150W 🖥️
  hardware_provider: nvidia
  hardware_type: cuda
  subsets:
    - unquantized
    - awq
    - bnb
    - gptq
  backends:
    - pytorch
- machine: 1xA100
  description: A100-80GB-275W 🖥️
  hardware_provider: nvidia
  hardware_type: cuda
  subsets:
    - unquantized
    - awq
    - bnb
    - gptq
    - torchao
  backends:
    - pytorch
- machine: 1xT4
  description: T4-16GB-70W 🖥️
  hardware_provider: nvidia
  hardware_type: cuda
  subsets:
    - unquantized
    - awq
    - bnb
    - gptq
    - torchao
  backends:
    - pytorch
- machine: 32vCPU-C7i
  description: Intel-Xeon-SPR-385W 🖥️
  detail: |
    We tested the [32vCPU AWS C7i](https://aws.amazon.com/ec2/instance-types/c7i/) instance for the benchmark.
  hardware_provider: intel
  hardware_type: cpu
  subsets:
    - unquantized
  backends:
    - pytorch
    - openvino
    - onnxruntime