Model has the same issue with memory
#1, opened by ReXommendation
It consumes a lot of memory and the console crashes. Can you make a quant without act-order, or just with no act at all?
act-order has no effect on RAM or VRAM usage. It's group_size that affects VRAM usage, and I already leave that unset for 33B models to minimise VRAM.
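As a rough illustration of why group_size matters and act-order doesn't, here is a back-of-the-envelope sketch. It assumes 4-bit weights, one 16-bit scale plus one 4-bit zero-point stored per group, and a nominal 33e9 parameters; the function name and those storage details are assumptions for illustration, not taken from any particular quantisation library. It counts weights only, not activations or cache.

```python
def quantized_weight_gb(n_params, bits=4, group_size=None, scale_bits=16):
    """Approximate size of the quantised weights in GB (10**9 bytes).

    Assumption: each group of `group_size` weights stores one scale
    (`scale_bits` wide) and one zero-point (`bits` wide). With no
    group_size the per-column scales are small enough to ignore here.
    act-order only changes the order in which weights are quantised,
    so it does not appear in this estimate at all.
    """
    total_bits = n_params * bits
    if group_size:
        n_groups = n_params / group_size
        total_bits += n_groups * (scale_bits + bits)
    return total_bits / 8 / 1e9


n = 33e9  # nominal parameter count for a "33B" model (assumption)
print(f"no group_size:  {quantized_weight_gb(n):.2f} GB")
print(f"group_size=128: {quantized_weight_gb(n, group_size=128):.2f} GB")
```

Under these assumptions the grouped version comes out a few percent larger, which is the extra VRAM that leaving group_size unset avoids.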
Large RAM usage is normal on Windows - just increase your Pagefile size to around 100GB, which will allow the model to load into RAM and then be offloaded to GPU.
This is on Manjaro (Arch-based Linux).