Updated official chat template (tool_calls fixed when None), removed silly tool_call_id length restriction, and added padding token.
- Mistral-Nemo-Instruct-2407-bf16-00001-of-00002.gguf +2 -2
- Mistral-Nemo-Instruct-2407.IQ1_M.gguf +2 -2
- Mistral-Nemo-Instruct-2407.IQ1_S.gguf +2 -2
- Mistral-Nemo-Instruct-2407.IQ2_M.gguf +2 -2
- Mistral-Nemo-Instruct-2407.IQ2_S.gguf +2 -2
- Mistral-Nemo-Instruct-2407.IQ2_XS.gguf +2 -2
- Mistral-Nemo-Instruct-2407.IQ2_XXS.gguf +2 -2
- Mistral-Nemo-Instruct-2407.IQ3_M.gguf +2 -2
- Mistral-Nemo-Instruct-2407.IQ3_S.gguf +2 -2
- Mistral-Nemo-Instruct-2407.IQ3_XS.gguf +2 -2
- Mistral-Nemo-Instruct-2407.IQ3_XXS.gguf +2 -2
- Mistral-Nemo-Instruct-2407.IQ4_XS.gguf +2 -2
- README.md +4 -2
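
In practice, the `tool_calls` fix concerns assistant turns in OpenAI message format, which can carry `"tool_calls": None` when no function is called; per the commit message, the updated template now renders such turns instead of failing on them. A minimal sketch with placeholder message contents:

```python
# Minimal sketch (placeholder contents): an OpenAI-format assistant turn
# may include "tool_calls": None when it made no function call; per the
# commit message, the updated template handles this case.
messages = [
    {"role": "user", "content": "What is the weather in Oslo?"},
    {
        "role": "assistant",
        "content": "I'll check that for you.",
        "tool_calls": None,  # previously problematic, now rendered as a plain turn
    },
    {"role": "user", "content": "Thanks!"},
]
```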
Mistral-Nemo-Instruct-2407-bf16-00001-of-00002.gguf
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:309781e5cdd8449494b87b1bdbed3e298b39326bda812a0d4d5c52ab5cb2263f
+size 7863577
Mistral-Nemo-Instruct-2407.IQ1_M.gguf
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:82d3b0045d45c4b78dc690025990093968ae580c1f6f0ca38640901b7199e369
+size 3221627424
Mistral-Nemo-Instruct-2407.IQ1_S.gguf
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:002714557f765d7940efe848cfd6f601acfd97161ec4d14dcc7119679164a8dc
+size 2999214624
Mistral-Nemo-Instruct-2407.IQ2_M.gguf
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:c2f692c12f9f9e8b5dfeb44b1648b681f452038a4986b04aacbbfc429492bddd
+size 4435026464
Mistral-Nemo-Instruct-2407.IQ2_S.gguf
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:5cf0270dda7405ffcc5970836685ca22a7852fae13c7f13b62a5e7f566dd5b13
+size 4138476064
Mistral-Nemo-Instruct-2407.IQ2_XS.gguf
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:4e138bd461ab2d6b62677f10de62f7bb9798281f0b48d6e0d78574e3dddd01aa
+size 3915080224
Mistral-Nemo-Instruct-2407.IQ2_XXS.gguf
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:526ddece56b1d6757195141c88d01a3f9f08c1cdfb6e6fcdabf9c78ff106ad43
+size 3592315424
Mistral-Nemo-Instruct-2407.IQ3_M.gguf
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:3b4b79c5666fd2e89a4b7b04dc3a55d010620464b17791acd8950e89de349ee5
+size 5722235424
Mistral-Nemo-Instruct-2407.IQ3_S.gguf
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:66b4e53beee6e62e2a5d113cdd7cd6c7e7b6b342e0f4c342bfddcedc48253ccd
+size 5562081824
Mistral-Nemo-Instruct-2407.IQ3_XS.gguf
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:89ada2c4a1e75d3f19cc8fc06c63e22c81df9310f4fe5f1f0386a518acc96384
+size 5306491424
Mistral-Nemo-Instruct-2407.IQ3_XXS.gguf
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:42cf29dd780480d524c085a6e71c68380fdf1cfef18f70d88fae6dfbcadfb710
+size 4945388064
Mistral-Nemo-Instruct-2407.IQ4_XS.gguf
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:c76829b6d1b2cdd94160ddb084f52abc3dbbe52eb20181d966206b61c7da5c61
+size 6742712864
README.md
CHANGED
@@ -19,6 +19,8 @@ quantized_by: CISC
 
 This repo contains State Of The Art quantized GGUF format model files for [Mistral-Nemo-Instruct-2407](https://huggingface.co/mistralai/Mistral-Nemo-Instruct-2407).
 
+**August 16th update**: Updated official chat template (`tool_calls` fixed when `None`), removed silly `tool_call_id` length restriction and added `padding` token.
+
 Quantization was done with an importance matrix that was trained for ~1M tokens (256 batches of 4096 tokens) of [groups_merged-enhancedV3.txt](https://github.com/ggerganov/llama.cpp/discussions/5263#discussioncomment-8395384) and [wiki.train.raw](https://raw.githubusercontent.com/pytorch/examples/main/word_language_model/data/wikitext-2/train.txt) concatenated.
 
 The embedded chat template is the updated one with correct Tekken tokenization and function calling support via OpenAI-compatible `tools` parameter, see [example](#simple-llama-cpp-python-example-function-calling-code).
@@ -238,7 +240,7 @@ print(llm.create_chat_completion(
             "content": None,
             "tool_calls": [
                 {
-                    "id": "call__0_get_current_weather_cmpl-..."
+                    "id": "call__0_get_current_weather_cmpl-...",
                     "type": "function",
                     "function": {
                         "name": "get_current_weather",
@@ -250,7 +252,7 @@ print(llm.create_chat_completion(
         { # The tool_call_id is from tool_calls and content is the result from the function call you made
             "role": "tool",
            "content": "20",
-            "tool_call_id": "call__0_get_current_weather_cmpl-..."
+            "tool_call_id": "call__0_get_current_weather_cmpl-..."
         }
     ],
     tools=[{
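
For readers piecing the README excerpts above together, here is a hedged sketch of the tool-call round trip they belong to, using llama-cpp-python. The model path, context size, location, and temperature value are placeholders, the tool schema is an assumed standard OpenAI-style function definition, and the flow relies on the embedded template supporting function calling as the README describes; this is not the README's exact example.

```python
# Sketch of the tool-call round trip, assuming llama-cpp-python and a
# locally downloaded IQ4_XS quant. Names and values are illustrative.
from llama_cpp import Llama

llm = Llama(model_path="Mistral-Nemo-Instruct-2407.IQ4_XS.gguf", n_ctx=8192)

tools = [{
    "type": "function",
    "function": {
        "name": "get_current_weather",
        "description": "Get the current temperature in a given location",
        "parameters": {  # assumed JSON-schema parameter block
            "type": "object",
            "properties": {"location": {"type": "string"}},
            "required": ["location"],
        },
    },
}]

messages = [{"role": "user", "content": "What is the weather in Oslo?"}]

# First call: the assistant reply carries tool_calls instead of content.
first = llm.create_chat_completion(messages=messages, tools=tools)
assistant_msg = first["choices"][0]["message"]
tool_call = assistant_msg["tool_calls"][0]

# Execute the function yourself, then echo the id back as tool_call_id
# in a role "tool" message, exactly as in the README diff above.
messages += [
    assistant_msg,
    {"role": "tool", "content": "20", "tool_call_id": tool_call["id"]},
]
print(llm.create_chat_completion(messages=messages, tools=tools))
```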