GGUF importance matrix (imatrix) quants for https://huggingface.co/codellama/CodeLlama-70b-Instruct-hf

The importance matrix was trained for 100K tokens (200 batches of 512 tokens) using wiki.train.raw.

The template for this model is very sensitive and must be set precisely: all whitespace is intentional, and the special tokens `<s>` and `<step>` must be encoded properly (as special tokens, not as literal text).

| Layers | Context | Template |
| --- | --- | --- |
| <pre>0</pre> | <pre>4096</pre> | <pre>\<s\> Source: system<br><br> {instructions}\<step\> Source: user<br><br> {prompt}\<step\> Source: assistant<br>Destination: user<br><br> {response}</pre> |
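The template row above can be assembled programmatically. A minimal sketch in Python, assuming `<br>` in the table stands for a newline; the function name is illustrative, not part of any library:

```python
def build_prompt(instructions: str, prompt: str) -> str:
    """Assemble the CodeLlama-70B-Instruct prompt from the template above.

    <s> and <step> must be tokenized as special tokens by the backend,
    not escaped as literal text. The leading spaces before the message
    bodies are part of the format and must not be stripped.
    """
    return (
        "<s> Source: system\n\n"
        f" {instructions}<step> Source: user\n\n"
        f" {prompt}<step> Source: assistant\n"
        "Destination: user\n\n "  # the model's {response} continues from here
    )
```

For inference, the prompt ends just after `Destination: user` and the trailing space, so generation begins exactly where `{response}` sits in the template.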