https://arxiv.org/abs/2502.05003

IST Austria Distributed Algorithms and Systems Lab
university
Collections (4)
Models prequantized with [HIGGS](https://arxiv.org/abs/2411.17525) zero-shot quantization. Requires the latest `transformers` to run.
- Paper: Pushing the Limits of Large Language Model Quantization via the Linearity Theorem (arXiv:2411.17525)
- ISTA-DASLab/Llama-3.3-70B-Instruct-HIGGS-GPTQ-4bit • 48 • 2
- ISTA-DASLab/Llama-3.1-8B-Instruct-HIGGS-GPTQ-4bit • Text Generation • 16
- ISTA-DASLab/Llama-3.1-8B-Instruct-HIGGS-GPTQ-3bit • Text Generation • 20
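
Since the collection note says these checkpoints need a recent `transformers` release, a minimal loading sketch may help. It assumes a CUDA GPU, the `accelerate` package (needed for `device_map`), and whatever extra kernel dependencies the HIGGS integration requires in your environment (believed to be the FLUTE kernels, but check the `transformers` quantization docs). The helper name `generate` and its defaults are illustrative, not part of any library.

```python
# Hypothetical helper for trying one of the prequantized checkpoints listed above.
# Assumptions: CUDA GPU, recent `transformers`, `accelerate` installed, and the
# kernel packages required by the HIGGS integration available.

MODEL_ID = "ISTA-DASLab/Llama-3.1-8B-Instruct-HIGGS-GPTQ-4bit"


def generate(prompt: str, model_id: str = MODEL_ID, max_new_tokens: int = 64) -> str:
    """Load a prequantized checkpoint and return a short completion."""
    # Imported lazily so the module can be inspected without the heavy dependencies.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    # The quantization settings are expected to be read from the checkpoint's
    # config, so no explicit quantization config is passed here.
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="cuda")

    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Drop the prompt tokens and decode only the newly generated ones.
    new_tokens = output_ids[0, inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)


if __name__ == "__main__":
    print(generate("Explain weight quantization in one sentence."))
```

Swapping `MODEL_ID` for another repository name from the lists on this page should work the same way, provided the model fits in GPU memory.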
Models (84)

- ISTA-DASLab/QuEST-800M-INT4 • Text Generation
- ISTA-DASLab/QuEST-800M-INT1
- ISTA-DASLab/QuEST-800M-sparse-INT4
- ISTA-DASLab/DeepSeek-R1-Distill-Qwen-14B-HIGGS-4bit • Text Generation • 9
- ISTA-DASLab/DeepSeek-R1-HIGGS-4bit • 39
- ISTA-DASLab/DeepSeek-R1-Distill-Qwen-32B-HIGGS-4bit • Text Generation • 14
- ISTA-DASLab/DeepSeek-R1-Distill-Llama-70B-HIGGS-4bit • Text Generation • 25
- ISTA-DASLab/QuEST-800M-BF16
- ISTA-DASLab/Llama-3.3-70B-Instruct-HIGGS-GPTQ-4bit • 48 • 2
- ISTA-DASLab/Llama-3.1-8B-HIGGS-GPTQ-4bit • Text Generation • 13