Notes
- 3.75 bpw test quant of CausalLM/35b-beta-long, which is itself a finetune of CohereForAI/c4ai-command-r-v01 (hence the corrected licensing).
- Should theoretically fit within 24GB of VRAM for inference.
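A rough back-of-the-envelope check of the 24GB claim, assuming a ~35B parameter count and ignoring KV cache and runtime overhead (both figures are assumptions for illustration, not measurements):

```python
# Estimate the weight memory of a 3.75 bpw quant of a ~35B-parameter model.
# params and bpw are assumptions; actual usage also includes KV cache,
# activations, and framework overhead, which grow with context length.
params = 35e9   # assumed parameter count
bpw = 3.75      # bits per weight of this quant

weights_gib = params * bpw / 8 / 1024**3
print(round(weights_gib, 1))  # ≈ 15.3 GiB for the weights alone
```

That leaves roughly 8 GiB of a 24GB card for KV cache and overhead, so long contexts may still require reduced cache precision or batch size.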
TBA
The tokenizer differs from Cohere's, and the chat template is ChatML. Fully fine-tuned at 128K+ context.
No LoRAs, no quants, no tricks; 30M+ SFT samples.
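Since the card specifies ChatML rather than Cohere's native template, a prompt for this model would look like the sketch below. The exact template is an assumption based on standard ChatML; the authoritative version is the `chat_template` field in the model's tokenizer_config.json, applied via `tokenizer.apply_chat_template`.

```python
# Minimal sketch of the ChatML format this card says the model uses.
# The role names and delimiters follow standard ChatML; verify against
# the model's own tokenizer_config.json before relying on them.
def chatml_prompt(messages):
    parts = []
    for m in messages:
        parts.append(f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n")
    parts.append("<|im_start|>assistant\n")  # generation prompt
    return "".join(parts)

print(chatml_prompt([{"role": "user", "content": "Hello"}]))
```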
Pressure Testing from: https://github.com/LeonEricsson/llmcontext