Change max_position_embeddings to original value
#18 opened 2 months ago by AshtonIsNotHere
Can you provide one model using `group_size=1024` to make the model smaller?
#15 opened 5 months ago by shuyuej
Optimum version does not support LLaMA 3.1 405B
#14 opened 5 months ago by Atomheart-Father
Source code to quantize the LLaMA 3.1 405B model
3
#10 opened 5 months ago by shuyuej
Quantization `gptq_marlin` does not work (gptq_marlin not found); removing it works
8
#7 opened 5 months ago by linpan
Accuracy tradeoff
#6 opened 5 months ago by shaamil101
ValueError when trying to run
2
#4 opened 5 months ago by itaytricks