CISCai committed
Commit 0e014b7
1 Parent(s): d2940b5

Updated official chat template (tool_calls fixed when None), removed silly tool_call_id length restriction, and added padding token.

Mistral-Nemo-Instruct-2407-bf16-00001-of-00002.gguf CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:d4d22be5a7c68e75bd8263f8cedb55dd63e15cbcce60e3ff1416a776251a6540
-size 7863456
+oid sha256:309781e5cdd8449494b87b1bdbed3e298b39326bda812a0d4d5c52ab5cb2263f
+size 7863577
Mistral-Nemo-Instruct-2407.IQ1_M.gguf CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:b940cc0ffd6e1a493afd429842fe99d967d2173b136eba35b0828a72f56f562a
-size 3221627296
+oid sha256:82d3b0045d45c4b78dc690025990093968ae580c1f6f0ca38640901b7199e369
+size 3221627424
Mistral-Nemo-Instruct-2407.IQ1_S.gguf CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:82dd0d71d3bae34a9776ec56b9b521bad2193b2f0e7d29002efed379db99d29a
-size 2999214496
+oid sha256:002714557f765d7940efe848cfd6f601acfd97161ec4d14dcc7119679164a8dc
+size 2999214624
Mistral-Nemo-Instruct-2407.IQ2_M.gguf CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:97f9afd43bc903b36d49781de3152e25ca6f91f848f01312647868250936b938
-size 4435026336
+oid sha256:c2f692c12f9f9e8b5dfeb44b1648b681f452038a4986b04aacbbfc429492bddd
+size 4435026464
Mistral-Nemo-Instruct-2407.IQ2_S.gguf CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:269e5f67b72c449603b58ac4ca7deb0b54ba688803d85542d02483f105770ebe
-size 4138475936
+oid sha256:5cf0270dda7405ffcc5970836685ca22a7852fae13c7f13b62a5e7f566dd5b13
+size 4138476064
Mistral-Nemo-Instruct-2407.IQ2_XS.gguf CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:8b4ff647558433d2d013201553c72c2e27d819d055435f75ff70fae3e3e723d2
-size 3915080096
+oid sha256:4e138bd461ab2d6b62677f10de62f7bb9798281f0b48d6e0d78574e3dddd01aa
+size 3915080224
Mistral-Nemo-Instruct-2407.IQ2_XXS.gguf CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:332d9a99a4d1012c6adc650e696c6b3762dba47e05abcc21cde5925837bc2a30
-size 3592315296
+oid sha256:526ddece56b1d6757195141c88d01a3f9f08c1cdfb6e6fcdabf9c78ff106ad43
+size 3592315424
Mistral-Nemo-Instruct-2407.IQ3_M.gguf CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:adda54bc47e014739d3f54700dc352bfe9d7dc939a752402e4b562b65110bb5b
-size 5722235296
+oid sha256:3b4b79c5666fd2e89a4b7b04dc3a55d010620464b17791acd8950e89de349ee5
+size 5722235424
Mistral-Nemo-Instruct-2407.IQ3_S.gguf CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:f9888c00e27c193ac59e230a04cad89f9925f66f54253e4e5eff0d423390dea7
-size 5562081696
+oid sha256:66b4e53beee6e62e2a5d113cdd7cd6c7e7b6b342e0f4c342bfddcedc48253ccd
+size 5562081824
Mistral-Nemo-Instruct-2407.IQ3_XS.gguf CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:150ff66d862134c5a54423f11230c40233e3dc22af8f04fd8d129c6184965c36
-size 5306491296
+oid sha256:89ada2c4a1e75d3f19cc8fc06c63e22c81df9310f4fe5f1f0386a518acc96384
+size 5306491424
Mistral-Nemo-Instruct-2407.IQ3_XXS.gguf CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:f62b4cd119b6270dd92ec9effa9cefd97b910e1aa0dbdab6eaa4a05d30e91d20
-size 4945387936
+oid sha256:42cf29dd780480d524c085a6e71c68380fdf1cfef18f70d88fae6dfbcadfb710
+size 4945388064
Mistral-Nemo-Instruct-2407.IQ4_XS.gguf CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:75cd95b015d33455a76b71a2cdeedc80d2100569654d62d375e5ce0f5b0982f4
-size 6742712736
+oid sha256:c76829b6d1b2cdd94160ddb084f52abc3dbbe52eb20181d966206b61c7da5c61
+size 6742712864
README.md CHANGED
@@ -19,6 +19,8 @@ quantized_by: CISC
 
 This repo contains State Of The Art quantized GGUF format model files for [Mistral-Nemo-Instruct-2407](https://huggingface.co/mistralai/Mistral-Nemo-Instruct-2407).
 
+**August 16th update**: Updated official chat template (`tool_calls` fixed when `None`), removed silly `tool_call_id` length restriction and added `padding` token.
+
 Quantization was done with an importance matrix that was trained for ~1M tokens (256 batches of 4096 tokens) of [groups_merged-enhancedV3.txt](https://github.com/ggerganov/llama.cpp/discussions/5263#discussioncomment-8395384) and [wiki.train.raw](https://raw.githubusercontent.com/pytorch/examples/main/word_language_model/data/wikitext-2/train.txt) concatenated.
 
 The embedded chat template is the updated one with correct Tekken tokenization and function calling support via OpenAI-compatible `tools` parameter, see [example](#simple-llama-cpp-python-example-function-calling-code).
@@ -238,7 +240,7 @@ print(llm.create_chat_completion(
             "content": None,
             "tool_calls": [
                 {
-                    "id": "call__0_get_current_weather_cmpl-..."[:9], # Make sure to truncate ID (chat template requires it)
+                    "id": "call__0_get_current_weather_cmpl-...",
                     "type": "function",
                     "function": {
                         "name": "get_current_weather",
@@ -250,7 +252,7 @@ print(llm.create_chat_completion(
         { # The tool_call_id is from tool_calls and content is the result from the function call you made
             "role": "tool",
             "content": "20",
-            "tool_call_id": "call__0_get_current_weather_cmpl-..."[:9] # Make sure to truncate ID (chat template requires it)
+            "tool_call_id": "call__0_get_current_weather_cmpl-..."
         }
     ],
     tools=[{
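
With the length restriction gone, the assistant's `tool_calls` entry and the follow-up `tool` message can carry the ID untouched, as long as the two match. A minimal sketch of the corrected message structure (the concrete ID value and user prompt are illustrative; the function name, arguments, and result follow the README example):

```python
# Hypothetical full-length ID — no longer truncated with [:9] as the old
# chat template required.
tool_call_id = "call__0_get_current_weather_cmpl-1234"

messages = [
    {"role": "user", "content": "What is the weather like today?"},
    {   # assistant turn that requested the tool call
        "role": "assistant",
        "content": None,
        "tool_calls": [
            {
                "id": tool_call_id,  # passed through at full length
                "type": "function",
                "function": {
                    "name": "get_current_weather",
                    "arguments": '{"location": "Oslo"}',
                },
            }
        ],
    },
    {   # tool result, echoing the same (untruncated) tool_call_id
        "role": "tool",
        "content": "20",
        "tool_call_id": tool_call_id,
    },
]
```

The only invariant the template now cares about is that `tool_call_id` on the `tool` message equals the `id` in the preceding `tool_calls` list; this `messages` list can be passed to `create_chat_completion` together with the `tools` definition as in the README example.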