πŸ“Ÿ Relay v0.1 (Mistral Nemo 2407)

This model page provides GGUF versions of relay-v0.1-Mistral-Nemo-2407. For more details about the model itself, please see the main model page (danlou/relay-v0.1-Mistral-Nemo-2407).

Note: If you have access to a CUDA GPU, it is highly recommended that you use the main (HF) version of the model with the relaylm.py script, which makes better use of commands (e.g., system messages). The relaylm.py script also supports 4-bit and 8-bit bitsandbytes quantization.
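For reference, loading the main (HF) model with a 4-bit bitsandbytes quant typically looks like the sketch below. This is the standard transformers approach, not the relaylm.py script itself; the exact settings relaylm.py uses may differ.

```python
# Minimal sketch (not relaylm.py itself): loading the main HF model
# with a 4-bit bitsandbytes quant via the standard transformers API.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "danlou/relay-v0.1-Mistral-Nemo-2407"

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # use load_in_8bit=True for 8-bit instead
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",  # places layers on the available CUDA GPU
)
```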

Custom Preset for LM Studio

To use these GGUF files with LM Studio, you should use the provided preset configuration file. Relay models use the ChatML template, but not the standard roles and system prompts, so the default ChatML preset will not work as intended.
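For context, ChatML wraps each turn in <|im_start|> / <|im_end|> markers, as in the sketch below. The "user"/"assistant" role names shown are the standard ChatML defaults, used here for illustration only; Relay's roles differ (which is why the custom preset is needed), so treat the preset file as the source of truth for the actual prompt format.

```python
# Generic ChatML turn format. NOTE: "user"/"assistant" are the standard
# ChatML role names; Relay uses its own roles, so the actual prompt
# format comes from the preset file, not this sketch.
def chatml_turn(role: str, content: str) -> str:
    return f"<|im_start|>{role}\n{content}<|im_end|>\n"

prompt = chatml_turn("user", "hi! what are you up to?") + "<|im_start|>assistant\n"
print(prompt)
```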

After you select and download the GGUF version you want to use:

  • Go to the My Models tab.
  • Click the button with the template name for the model (e.g., ChatML).
  • Click Import Preset from file..., and select the file.
  • Confirm that the model is set to use the relay preset.
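If you want to sanity-check a downloaded GGUF file outside LM Studio, a minimal sketch using llama-cpp-python is shown below. The model file name and sampling settings are assumptions, and the prompt is a generic ChatML-style placeholder; as noted above, the real prompt format should come from the Relay preset.

```python
# Minimal smoke test for a downloaded GGUF using llama-cpp-python
# (pip install llama-cpp-python). The model_path is an assumed example
# file name; use the path of the quant you actually downloaded.
from llama_cpp import Llama

llm = Llama(
    model_path="relay-v0.1-Mistral-Nemo-2407-Q4_K_M.gguf",  # assumed file name
    n_ctx=4096,
)

# Placeholder ChatML-style prompt for illustration only; the Relay
# preset defines the actual roles and format.
out = llm(
    "<|im_start|>user\nhi! what are you up to?<|im_end|>\n<|im_start|>assistant\n",
    max_tokens=128,
    stop=["<|im_end|>"],
)
print(out["choices"][0]["text"])
```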
Model Details

  • Format: GGUF
  • Model size: 12.2B params
  • Architecture: llama
  • Available quantizations: 2-bit, 4-bit, 8-bit, 16-bit
