Unknown quantization type, got fp8

#179
by DenisFavaCerchiaro - opened

Does anyone have any advice on how to fix this issue?

Unknown quantization type, got fp8 - supported types are: ['awq', 'bitsandbytes_4bit', 'bitsandbytes_8bit', 'gptq', 'aqlm', 'quanto', 'eetq', 'higgs', 'hqq', 'compressed-tensors', 'fbgemm_fp8', 'torchao', 'bitnet', 'vptq']

The error message "ValueError: Unknown quantization type, got fp8" means that the quantization type you specified ("fp8") is not one of the methods AutoQuantizationConfig.from_dict recognizes. Note also that the bnb_4bit_quant_type parameter of BitsAndBytesConfig does not accept "fp8"; its valid values are "fp4" and "nf4". If fp8 quantization is what you actually want, use the method name "fbgemm_fp8", which does appear in the supported list.
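As a reference point (not from the original poster's code), here is a minimal sketch of loading a model with the "fbgemm_fp8" method from the supported list. It assumes a recent transformers release, the fbgemm-gpu package installed, and uses a placeholder model id.

```python
# Minimal sketch: fp8 quantization via the "fbgemm_fp8" method.
# Assumes transformers >= 4.43 and fbgemm-gpu are installed; model id is a placeholder.
from transformers import AutoModelForCausalLM, FbgemmFp8Config

model_id = "your-org/your-model"  # placeholder

quantization_config = FbgemmFp8Config()  # quantize weights to fp8 using FBGEMM kernels

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=quantization_config,
    device_map="auto",
)
```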

The traceback shows that the error is raised while loading the model with the specified quantization configuration. This suggests that the quantization configuration you are passing to from_pretrained is either incorrect or not supported by the selected model or by the version of the transformers library you are using.
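If you do not specifically need fp8, a common workaround is to pass one of the supported configurations explicitly. Below is a minimal sketch, assuming bitsandbytes is installed and that 4-bit loading is acceptable for your use case; the model id is again a placeholder.

```python
# Minimal sketch: 4-bit loading with bitsandbytes, one of the supported quantization types.
# "nf4" (or "fp4") is a valid bnb_4bit_quant_type; "fp8" is not.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

model_id = "your-org/your-model"  # placeholder

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",            # valid values: "fp4" or "nf4"
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)
```

If the fp8 method comes from the model's own config.json rather than from your code, upgrading transformers to a version that supports that quantization method is usually the cleaner fix.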
