Updated official chat template (tool_calls fixed when None), removed silly tool_call_id length restriction, and added padding token.
- Mistral-Nemo-Instruct-2407-bf16-00001-of-00002.gguf +2 -2
- Mistral-Nemo-Instruct-2407.IQ1_M.gguf +2 -2
- Mistral-Nemo-Instruct-2407.IQ1_S.gguf +2 -2
- Mistral-Nemo-Instruct-2407.IQ2_M.gguf +2 -2
- Mistral-Nemo-Instruct-2407.IQ2_S.gguf +2 -2
- Mistral-Nemo-Instruct-2407.IQ2_XS.gguf +2 -2
- Mistral-Nemo-Instruct-2407.IQ2_XXS.gguf +2 -2
- Mistral-Nemo-Instruct-2407.IQ3_M.gguf +2 -2
- Mistral-Nemo-Instruct-2407.IQ3_S.gguf +2 -2
- Mistral-Nemo-Instruct-2407.IQ3_XS.gguf +2 -2
- Mistral-Nemo-Instruct-2407.IQ3_XXS.gguf +2 -2
- Mistral-Nemo-Instruct-2407.IQ4_XS.gguf +2 -2
- README.md +4 -2
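
In practice, the `tool_calls` fix concerns assistant turns in OpenAI message format, which can carry `"tool_calls": None` when no function is called; per the commit message, the updated template now renders such turns instead of failing on them. A minimal sketch with placeholder message contents:

```python
# Minimal sketch (placeholder contents): an OpenAI-format assistant turn
# may include "tool_calls": None when it made no function call; per the
# commit message, the updated template handles this case.
messages = [
    {"role": "user", "content": "What is the weather in Oslo?"},
    {
        "role": "assistant",
        "content": "I'll check that for you.",
        "tool_calls": None,  # previously problematic, now rendered as a plain turn
    },
    {"role": "user", "content": "Thanks!"},
]
```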
Mistral-Nemo-Instruct-2407-bf16-00001-of-00002.gguf
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:309781e5cdd8449494b87b1bdbed3e298b39326bda812a0d4d5c52ab5cb2263f
+size 7863577
Mistral-Nemo-Instruct-2407.IQ1_M.gguf
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:82d3b0045d45c4b78dc690025990093968ae580c1f6f0ca38640901b7199e369
+size 3221627424
Mistral-Nemo-Instruct-2407.IQ1_S.gguf
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:002714557f765d7940efe848cfd6f601acfd97161ec4d14dcc7119679164a8dc
+size 2999214624
Mistral-Nemo-Instruct-2407.IQ2_M.gguf
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:c2f692c12f9f9e8b5dfeb44b1648b681f452038a4986b04aacbbfc429492bddd
+size 4435026464
Mistral-Nemo-Instruct-2407.IQ2_S.gguf
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:5cf0270dda7405ffcc5970836685ca22a7852fae13c7f13b62a5e7f566dd5b13
+size 4138476064
Mistral-Nemo-Instruct-2407.IQ2_XS.gguf
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:4e138bd461ab2d6b62677f10de62f7bb9798281f0b48d6e0d78574e3dddd01aa
+size 3915080224
Mistral-Nemo-Instruct-2407.IQ2_XXS.gguf
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:526ddece56b1d6757195141c88d01a3f9f08c1cdfb6e6fcdabf9c78ff106ad43
+size 3592315424
Mistral-Nemo-Instruct-2407.IQ3_M.gguf
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:3b4b79c5666fd2e89a4b7b04dc3a55d010620464b17791acd8950e89de349ee5
+size 5722235424
Mistral-Nemo-Instruct-2407.IQ3_S.gguf
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:66b4e53beee6e62e2a5d113cdd7cd6c7e7b6b342e0f4c342bfddcedc48253ccd
+size 5562081824
Mistral-Nemo-Instruct-2407.IQ3_XS.gguf
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:89ada2c4a1e75d3f19cc8fc06c63e22c81df9310f4fe5f1f0386a518acc96384
+size 5306491424
Mistral-Nemo-Instruct-2407.IQ3_XXS.gguf
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:42cf29dd780480d524c085a6e71c68380fdf1cfef18f70d88fae6dfbcadfb710
+size 4945388064
Mistral-Nemo-Instruct-2407.IQ4_XS.gguf
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:c76829b6d1b2cdd94160ddb084f52abc3dbbe52eb20181d966206b61c7da5c61
+size 6742712864
README.md
CHANGED
@@ -19,6 +19,8 @@ quantized_by: CISC
 
 This repo contains State Of The Art quantized GGUF format model files for [Mistral-Nemo-Instruct-2407](https://huggingface.co/mistralai/Mistral-Nemo-Instruct-2407).
 
+**August 16th update**: Updated official chat template (`tool_calls` fixed when `None`), removed silly `tool_call_id` length restriction and added `padding` token.
+
 Quantization was done with an importance matrix that was trained for ~1M tokens (256 batches of 4096 tokens) of [groups_merged-enhancedV3.txt](https://github.com/ggerganov/llama.cpp/discussions/5263#discussioncomment-8395384) and [wiki.train.raw](https://raw.githubusercontent.com/pytorch/examples/main/word_language_model/data/wikitext-2/train.txt) concatenated.
 
 The embedded chat template is the updated one with correct Tekken tokenization and function calling support via OpenAI-compatible `tools` parameter, see [example](#simple-llama-cpp-python-example-function-calling-code).
@@ -238,7 +240,7 @@ print(llm.create_chat_completion(
             "content": None,
             "tool_calls": [
                 {
-                    "id": "call__0_get_current_weather_cmpl-..."
+                    "id": "call__0_get_current_weather_cmpl-...",
                     "type": "function",
                     "function": {
                         "name": "get_current_weather",
@@ -250,7 +252,7 @@ print(llm.create_chat_completion(
         { # The tool_call_id is from tool_calls and content is the result from the function call you made
             "role": "tool",
            "content": "20",
-            "tool_call_id": "call__0_get_current_weather_cmpl-..."
+            "tool_call_id": "call__0_get_current_weather_cmpl-..."
         }
     ],
     tools=[{
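
For readers piecing the README excerpts above together, here is a hedged sketch of the tool-call round trip they belong to, using llama-cpp-python. The model path, context size, location, and temperature value are placeholders, the tool schema is an assumed standard OpenAI-style function definition, and the flow relies on the embedded template supporting function calling as the README describes; this is not the README's exact example.

```python
# Sketch of the tool-call round trip, assuming llama-cpp-python and a
# locally downloaded IQ4_XS quant. Names and values are illustrative.
from llama_cpp import Llama

llm = Llama(model_path="Mistral-Nemo-Instruct-2407.IQ4_XS.gguf", n_ctx=8192)

tools = [{
    "type": "function",
    "function": {
        "name": "get_current_weather",
        "description": "Get the current temperature in a given location",
        "parameters": {  # assumed JSON-schema parameter block
            "type": "object",
            "properties": {"location": {"type": "string"}},
            "required": ["location"],
        },
    },
}]

messages = [{"role": "user", "content": "What is the weather in Oslo?"}]

# First call: the assistant reply carries tool_calls instead of content.
first = llm.create_chat_completion(messages=messages, tools=tools)
assistant_msg = first["choices"][0]["message"]
tool_call = assistant_msg["tool_calls"][0]

# Execute the function yourself, then echo the id back as tool_call_id
# in a role "tool" message, exactly as in the README diff above.
messages += [
    assistant_msg,
    {"role": "tool", "content": "20", "tool_call_id": tool_call["id"]},
]
print(llm.create_chat_completion(messages=messages, tools=tools))
```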