`low_cpu_mem_usage` was None, now default to True since model is quantized.
Loading checkpoint shards: 0%| | 0/4 [00:00