What is v1/v2?
#1 opened by clxy
V1 is the first version of GPTQ. It did not support group size or act-order on CUDA, and it used a slightly different model format.
V2 is the newer version. It is slower, but it supports group sizes, act-order, true-sequential, etc.
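In case "group size" is unfamiliar, here is a rough sketch of what that setting controls: weights are split into groups, and each group gets its own scale and offset. This is plain round-to-nearest quantization in pure Python, not the GPTQ algorithm and not code from any GPTQ repo; the function names (`quantize_groups`, `dequantize_groups`, `max_error`) and the sample weights are made up for illustration.

```python
# Illustrative only: round-to-nearest quantization with one (scale, minimum)
# pair per group. This is NOT GPTQ itself; it only shows what a "group size"
# parameter means. All names here are hypothetical.

def quantize_groups(weights, group_size, bits=4):
    """Quantize a flat list of floats in chunks of `group_size`.

    Returns (codes, params), where params holds one (scale, minimum)
    pair per group.
    """
    levels = (1 << bits) - 1  # 15 distinct steps for 4-bit codes
    codes, params = [], []
    for start in range(0, len(weights), group_size):
        group = weights[start:start + group_size]
        lo, hi = min(group), max(group)
        # Guard against a zero range when all weights in a group are equal.
        scale = (hi - lo) / levels if hi > lo else 1.0
        params.append((scale, lo))
        codes.extend(round((w - lo) / scale) for w in group)
    return codes, params


def dequantize_groups(codes, params, group_size):
    """Map integer codes back to floats using each group's parameters."""
    restored = []
    for i, code in enumerate(codes):
        scale, lo = params[i // group_size]
        restored.append(code * scale + lo)
    return restored


def max_error(weights, group_size, bits=4):
    """Largest absolute reconstruction error for a given group size."""
    codes, params = quantize_groups(weights, group_size, bits)
    restored = dequantize_groups(codes, params, group_size)
    return max(abs(a - b) for a, b in zip(weights, restored))


if __name__ == "__main__":
    # Four small weights followed by four large ones (made-up values).
    weights = [0.01, -0.02, 0.03, 0.00, 5.0, -4.0, 3.0, 1.0]
    print("one group of 8: ", max_error(weights, 8))
    print("two groups of 4:", max_error(weights, 4))
```

With a single group, the small weights have to share a scale with the large ones, so they are reconstructed coarsely. With smaller groups, each group's scale fits its own range, at the cost of storing more scale/offset pairs.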
Thanks for the clarification!
clxy changed discussion status to closed