Update README.md
README.md
CHANGED
@@ -27,7 +27,7 @@ Transformers implementation of [Pixtral-Large-Instruct-2411](https://huggingface
 ## Tokenizer And Prompt Template
 Using conversion of v7m1 tokenizer with 32k vocab size.
 
-Chat template in tokenizer_config.json uses the v7 instruct template:
+Chat template in chat_template.json uses the v7 instruct template:
 
 ```
 <s>[SYSTEM_PROMPT] <system prompt>[/SYSTEM_PROMPT][INST] <user message>[/INST] <assistant response></s>[INST] <user message>[/INST]
@@ -36,7 +36,7 @@ Chat template in tokenizer_config.json uses the v7 instruct template:
 ## Notes
 *- tool use hasn't been implemented in the template yet. I'll add this in later.*
 *- I've added extra stop tokens between consecutive user messages. Helps contexts where there'll be multiple speakers etc but your milage may vary.*
-*- If you have a better implementation of the tokenizer let me know and I'm happy to swap it out.*
+*- If you have a better implementation of the tokenizer let me know and I'm happy to swap it out.*
 *- As always pls respect the model license.*
 
 Currently doing a fresh measurement run ahead of re-doing my exl2 quants which I'll upload. Apologies in advance if anything is wonky, tbh this is just a personal learning exercise for me and I decided to make this model my fixation to freshen up on my knowledge lol.
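For illustration, the v7 instruct template quoted in the diff can be assembled by hand. The sketch below is a hypothetical helper (`build_v7_prompt` is not part of the repo); in practice you would load the tokenizer and call `tokenizer.apply_chat_template`, which uses the template shipped in `chat_template.json`.

```python
def build_v7_prompt(system, turns):
    """Assemble a v7-instruct-style prompt string (illustrative sketch only).

    `turns` is a list of (user_message, assistant_response) pairs; pass None
    as the assistant response in the final pair to leave the prompt open for
    the model to generate. Real usage should go through the tokenizer's
    apply_chat_template so special tokens are handled by the tokenizer itself.
    """
    out = "<s>"
    if system:
        # System prompt goes first, wrapped in [SYSTEM_PROMPT] ... [/SYSTEM_PROMPT]
        out += f"[SYSTEM_PROMPT] {system}[/SYSTEM_PROMPT]"
    for user, assistant in turns:
        # Each user turn is wrapped in [INST] ... [/INST]
        out += f"[INST] {user}[/INST]"
        if assistant is not None:
            # Completed assistant turns are closed with the EOS token </s>
            out += f" {assistant}</s>"
    return out
```

For example, one completed exchange followed by an open user turn yields `<s>[SYSTEM_PROMPT] <system prompt>[/SYSTEM_PROMPT][INST] <user message>[/INST] <assistant response></s>[INST] <user message>[/INST]`, matching the template in the diff above.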