llm-perf-leaderboard / hardware.yaml

Add torchao int4 weight only quantization as an option (#34)


- machine: 1xA10
  description: A10-24GB-150W 🖥️
  hardware_provider: nvidia
  hardware_type: cuda
  subsets:
    - unquantized
    - awq
    - bnb
    - gptq
  backends:
    - pytorch

- machine: 1xA100
  description: A100-80GB-275W 🖥️
  hardware_provider: nvidia
  hardware_type: cuda
  subsets:
    - unquantized
    - awq
    - bnb
    - gptq
    - torchao
  backends:
    - pytorch

- machine: 1xT4
  description: T4-16GB-70W 🖥️
  hardware_provider: nvidia
  hardware_type: cuda
  subsets:
    - unquantized
    - awq
    - bnb
    - gptq
    - torchao
  backends:
    - pytorch

- machine: 32vCPU-C7i
  description: Intel-Xeon-SPR-385W 🖥️
  detail: |
    We tested the [32vCPU AWS C7i](https://aws.amazon.com/ec2/instance-types/c7i/) instance for the benchmark.
  hardware_provider: intel
  hardware_type: cpu
  subsets:
    - unquantized
  backends:
    - pytorch
    - openvino
    - onnxruntime
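
The entries above form a list of mappings, so a consumer can filter machines by quantization subset or backend. The following is a minimal sketch of that lookup, not the leaderboard's own code: `HARDWARE` is a hand-copied subset of the fields in this file, and `machines_supporting` is a hypothetical helper name. In practice the list would more likely come from a YAML parser (for example PyYAML's `yaml.safe_load`) than be written inline.

```python
# Hand-copied from hardware.yaml for illustration only; descriptions and
# provider fields are omitted to keep the sketch short.
HARDWARE = [
    {"machine": "1xA10", "hardware_type": "cuda",
     "subsets": ["unquantized", "awq", "bnb", "gptq"],
     "backends": ["pytorch"]},
    {"machine": "1xA100", "hardware_type": "cuda",
     "subsets": ["unquantized", "awq", "bnb", "gptq", "torchao"],
     "backends": ["pytorch"]},
    {"machine": "1xT4", "hardware_type": "cuda",
     "subsets": ["unquantized", "awq", "bnb", "gptq", "torchao"],
     "backends": ["pytorch"]},
    {"machine": "32vCPU-C7i", "hardware_type": "cpu",
     "subsets": ["unquantized"],
     "backends": ["pytorch", "openvino", "onnxruntime"]},
]


def machines_supporting(entries, subset=None, backend=None):
    """Return machine names whose entry lists the given subset and/or backend.

    Hypothetical helper: a filter left as None is not applied.
    """
    names = []
    for entry in entries:
        if subset is not None and subset not in entry.get("subsets", []):
            continue
        if backend is not None and backend not in entry.get("backends", []):
            continue
        names.append(entry["machine"])
    return names


if __name__ == "__main__":
    print(machines_supporting(HARDWARE, subset="torchao"))
    print(machines_supporting(HARDWARE, backend="openvino"))
```

With the data as listed in this file, the `torchao` subset is enabled on the A100 and T4 entries but not on the A10 entry, and only the CPU machine lists backends other than `pytorch`.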