Can you create a 2x20B variant, since not everyone has enough VRAM to run a model of this size?
#2 by ReXommendation - opened
It would still be large, but more people could run it, especially those with a P40 and 32GB of RAM or less.