VRAM Requirements

#1
by practical-dreamer - opened

Can you please put approximate VRAM usage for this model? Thank you

No idea, sorry. I don't actually use these myself; I just did them as a favor for someone else who does.

Unfortunately, it needs more than 48GB if using 8192 context. On an A100 (80GB), you'll be at 72% GPU memory used when the model is initially loaded.
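For anyone who wants that percentage in GB, here is a quick back-of-the-envelope check. It only converts the figures quoted above (80GB card, 72% used at load) and compares the result with a 48GB budget. The helper name is made up for illustration, and it does not account for the extra memory that gets used as the context fills up.

```python
def used_gb(total_gb: float, fraction_used: float) -> float:
    """Convert a 'percent of GPU memory used' reading into GB."""
    return total_gb * fraction_used

# Figures quoted in the thread: A100 80GB, 72% used right after loading.
a100_used = used_gb(80, 0.72)

# 48GB is the budget mentioned in the thread.
fits_in_48 = a100_used <= 48

print(f"{a100_used:.1f} GB used at load; fits in 48 GB: {fits_in_48}")
```

So the load-time footprint alone is already well past 48GB, before any growth from generation.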

Yeah, I tried splitting it across two 3090s and got OOM. I’d be interested in a 4K length variant if one is ever made.

Given that the 33b 16k tunes outperform the 33b 8k tunes at all context sizes, I suspect a 4k would do worse than this one. If anything, a 16k might be better. But I'm also waiting to see proper results from the new NTK-by-parts/ntkv2 finetune method.

If you're only after 4k, you should be able to run this one with a max context of 4k, as long as compress_pos_emb stays at 4.
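A rough sketch of why the factor would stay at 4 even with a shorter context, assuming compress_pos_emb is a linear position-scaling factor (each token index is divided by it before the positional embedding is computed). The function below is illustrative only and is not any loader's actual code. The 2048 figure is just 8192 / 4, which I'm assuming matches the base model's native context.

```python
def compressed_positions(seq_len: int, compress_pos_emb: float) -> list[float]:
    """Linear position scaling: token index i is treated as position i / factor."""
    return [i / compress_pos_emb for i in range(seq_len)]

TRAINED_CONTEXT = 8192  # from the thread: this is an 8192-context tune
FACTOR = 4              # from the thread: compress_pos_emb = 4

# Assumption: the factor maps the tuned context back onto the base model's range.
implied_base = TRAINED_CONTEXT // FACTOR

# Running at 4k with the same factor uses the same position spacing (0.25 per
# token) that the tune was trained with, just over a shorter range.
pos_4k = compressed_positions(4096, FACTOR)
max_pos_4k = pos_4k[-1]

print(implied_base, pos_4k[1] - pos_4k[0], max_pos_4k)
```

Under that assumption, dropping the factor to match the shorter context would change the spacing between positions to something the tune was not trained on, which is presumably why the setting should stay at 4.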
