exl3 quants

#4
by homeworkace - opened

On the off chance that you take requests from the Community tab, may I humbly ask for ExLlamaV3 quants of c4ai-command-r-08-2024 and c4ai-command-a-03-2025? Whichever you think can fit into 24GB of VRAM!

Command-A is a huge model, so that's going to be tough in 24 GB. Not impossible, perhaps. But I converted Command-R now. It's here. 4.0bpw should fit in 24 GB with about a 40k context.
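The "4.0bpw in 24 GB" figure can be sanity-checked with simple arithmetic: quantized weight memory is roughly parameter count times bits-per-weight divided by 8, and whatever VRAM remains goes to the KV cache and activations. A minimal sketch, assuming Command-R 08-2024 has roughly 32B parameters (check the model card; the helper below is illustrative, not part of any library):

```python
def weight_vram_gib(n_params: float, bpw: float) -> float:
    """Rough VRAM needed for quantized weights: params * bpw / 8 bytes, in GiB."""
    return n_params * bpw / 8 / 1024**3

# ~32B parameters at 4.0 bits per weight (parameter count is an assumption)
print(round(weight_vram_gib(32e9, 4.0), 1))  # ~14.9 GiB for weights alone
```

On a 24 GB card that leaves roughly 9 GiB for the KV cache and activations, which is consistent with a context window on the order of 40k tokens. By the same arithmetic, a ~111B-parameter model like Command-A would need well over 24 GiB for weights alone even at very low bpw, which is why it is a much tighter fit.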

This is massive! Unfortunately I wasn't able to get it working, so I will open an issue on the new model page, but thanks for the effort!

homeworkace changed discussion status to closed
