Quantizations of https://huggingface.co/qingy2024/GRMR-2B-Instruct

My note

Use with llama.cpp like this:

llama-cli -m GRMR-2B-Instruct_quant.gguf -ngl 99 --conversation --temp 0.0 --reverse-prompt "Below is the original text. Please rewrite it to correct any grammatical errors if any, improve clarity, and enhance overall readability." --in-prefix "### Original Text:" --in-suffix "### Corrected Text:" --prompt " " --repeat-penalty 1.0

From original readme

This fine-tune of Gemma 2 2B is trained to take any input text and repeat it (with fixed grammar).

Example:

User: Find a clip from a professional production of any musical within the past 50 years. The Tony awards have a lot of great options of performances of Tony nominated performances in the archives on their websites.

GRMR-2B-Instruct: Find a clip from a professional production of any musical within the past 50 years. The Tony Awards have a lot of great options of performances of Tony-nominated performances in their archives on their websites.

Note: This model uses a custom chat template:

Below is the original text. Please rewrite it to correct any grammatical errors if any, improve clarity, and enhance overall readability.

### Original Text:
{PROMPT HERE}

### Corrected Text:
{MODEL'S OUTPUT HERE}
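
For clients that send a raw prompt instead of applying a chat template, the layout above can be reproduced with plain string formatting. This is only a minimal sketch: the helper names `build_prompt` and `extract_correction` are hypothetical, and it assumes the blank lines shown in the template are literal newlines and that generation should start right after the "### Corrected Text:" header.

```python
# Sketch only: helper names are hypothetical, and the exact whitespace
# between template sections is an assumption based on the layout above.

INSTRUCTION = (
    "Below is the original text. Please rewrite it to correct any "
    "grammatical errors if any, improve clarity, and enhance overall "
    "readability."
)
ORIGINAL_HEADER = "### Original Text:"
CORRECTED_HEADER = "### Corrected Text:"


def build_prompt(text: str) -> str:
    """Wrap `text` in the template, ending where the model should start writing."""
    return (
        f"{INSTRUCTION}\n\n"
        f"{ORIGINAL_HEADER}\n{text.strip()}\n\n"
        f"{CORRECTED_HEADER}\n"
    )


def extract_correction(completion: str) -> str:
    """Pull the corrected text out of a completion.

    Works whether or not the completion echoes the prompt, and cuts the
    result off if the model begins another "### Original Text:" section.
    """
    if CORRECTED_HEADER in completion:
        completion = completion.split(CORRECTED_HEADER, 1)[1]
    return completion.split(ORIGINAL_HEADER, 1)[0].strip()
```

Pass the string returned by `build_prompt` to the inference client as the full prompt, then run the generated text through `extract_correction` to get only the corrected passage.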

I would recommend a temperature of 0.0 and a repeat penalty of 1.0 for this model to get optimal results.

Disclaimer: I ran this text through the model itself to correct the grammar.

GGUF
Model size: 2.61B params
Architecture: gemma2
Available quantizations: 1-bit, 2-bit, 3-bit, 4-bit, 5-bit, 6-bit, 8-bit
