This model was quantized using a modified version of the EETQ repo. Work is in progress to decouple its kernels from CUTLASS, which should make it easier to use.
Quantized to 8 bits.
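
For readers unfamiliar with 8-bit quantization, here is a rough sketch of the general idea in plain Python. It is not the modified EETQ code and does not use its kernels. It assumes symmetric scaling with one scale per weight row, which may differ from what this model actually uses. The names `quantize_row_int8` and `dequantize_row` are hypothetical and exist only for this illustration.

```python
# Illustrative only: generic symmetric 8-bit quantization, NOT the EETQ kernels.
# Assumption: one scale per weight row, values mapped into [-127, 127].

def quantize_row_int8(row):
    """Quantize one row of float weights to integers in [-127, 127]."""
    max_abs = max(abs(w) for w in row)
    # Fall back to a scale of 1.0 for an all-zero row to avoid dividing by zero.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in row]
    return q, scale


def dequantize_row(q, scale):
    """Recover approximate float weights from the integers and the row scale."""
    return [v * scale for v in q]


# Toy weight matrix (made-up values).
weights = [[0.5, -1.0, 0.25], [2.0, 0.0, -0.5]]
quantized = [quantize_row_int8(row) for row in weights]
restored = [dequantize_row(q, scale) for q, scale in quantized]
```

Each restored value differs from the original by at most half a quantization step (`scale / 2`), which is the rounding error this kind of scheme trades for storing each weight in a single byte.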