|
--- |
|
language: |
|
- en |
|
- es |
|
- it |
|
- de |
|
- fr |
|
license: apache-2.0 |
|
|
|
extra_gated_description: If you want to learn more about how we process your personal data, please read our <a href="https://mistral.ai/terms/">Privacy Policy</a>. |
|
--- |
|
|
|
# Model Card for Mixtral-8x22B-Instruct-v0.1 |
|
|
|
|
|
## Encode and Decode with `mistral_common` |
|
|
|
```py |
|
from mistral_common.tokens.tokenizers.mistral import MistralTokenizer |
|
from mistral_common.protocol.instruct.messages import UserMessage |
|
from mistral_common.protocol.instruct.request import ChatCompletionRequest |
|
|
|
mistral_models_path = "MISTRAL_MODELS_PATH" |
|
|
|
tokenizer = MistralTokenizer.v3() |
|
|
|
completion_request = ChatCompletionRequest(messages=[UserMessage(content="Explain Machine Learning to me in a nutshell.")]) |
|
|
|
tokens = tokenizer.encode_chat_completion(completion_request).tokens |
|
``` |
|
|
|
## Inference with `mistral_inference` |
|
|
|
```py |
|
from mistral_inference.model import Transformer |
|
from mistral_inference.generate import generate |
|
|
|
model = Transformer.from_folder(mistral_models_path) |
|
out_tokens, _ = generate([tokens], model, max_tokens=64, temperature=0.0, eos_id=tokenizer.instruct_tokenizer.tokenizer.eos_id) |
|
|
|
result = tokenizer.decode(out_tokens[0]) |
|
|
|
print(result) |
|
``` |
|
|
|
## Inference with hugging face `transformers` |
|
|
|
```py |
|
from transformers import AutoModelForCausalLM |
|
|
|
model = AutoModelForCausalLM.from_pretrained("mistralai/Mixtral-8x22B-Instruct-v0.1") |
|
model.to("cuda") |
|
|
|
generated_ids = model.generate(tokens, max_new_tokens=1000, do_sample=True) |
|
|
|
# decode with mistral tokenizer |
|
result = tokenizer.decode(generated_ids[0].tolist()) |
|
print(result) |
|
``` |
|
|
|
> [!TIP] |
|
> PRs to correct the `transformers` tokenizer so that it gives 1-to-1 the same results as the `mistral_common` reference implementation are very welcome! |
|
|
|
--- |
|
The Mixtral-8x22B-Instruct-v0.1 Large Language Model (LLM) is an instruct fine-tuned version of the [Mixtral-8x22B-v0.1](https://huggingface.co/mistralai/Mixtral-8x22B-v0.1). |
|
|
|
## Run the model |
|
```python |
|
from transformers import AutoModelForCausalLM |
|
from mistral_common.protocol.instruct.messages import ( |
|
AssistantMessage, |
|
UserMessage, |
|
) |
|
from mistral_common.protocol.instruct.tool_calls import ( |
|
Tool, |
|
Function, |
|
) |
|
from mistral_common.tokens.tokenizers.mistral import MistralTokenizer |
|
from mistral_common.tokens.instruct.normalize import ChatCompletionRequest |
|
|
|
device = "cuda" # the device to load the model onto |
|
|
|
tokenizer_v3 = MistralTokenizer.v3() |
|
|
|
mistral_query = ChatCompletionRequest( |
|
tools=[ |
|
Tool( |
|
function=Function( |
|
name="get_current_weather", |
|
description="Get the current weather", |
|
parameters={ |
|
"type": "object", |
|
"properties": { |
|
"location": { |
|
"type": "string", |
|
"description": "The city and state, e.g. San Francisco, CA", |
|
}, |
|
"format": { |
|
"type": "string", |
|
"enum": ["celsius", "fahrenheit"], |
|
"description": "The temperature unit to use. Infer this from the users location.", |
|
}, |
|
}, |
|
"required": ["location", "format"], |
|
}, |
|
) |
|
) |
|
], |
|
messages=[ |
|
UserMessage(content="What's the weather like today in Paris"), |
|
], |
|
model="test", |
|
) |
|
|
|
encodeds = tokenizer_v3.encode_chat_completion(mistral_query).tokens |
|
model = AutoModelForCausalLM.from_pretrained("mistralai/Mixtral-8x22B-Instruct-v0.1") |
|
model_inputs = encodeds.to(device) |
|
model.to(device) |
|
|
|
generated_ids = model.generate(model_inputs, max_new_tokens=1000, do_sample=True) |
|
sp_tokenizer = tokenizer_v3.instruct_tokenizer.tokenizer |
|
decoded = sp_tokenizer.decode(generated_ids[0]) |
|
print(decoded) |
|
``` |
|
Alternatively, you can run this example with the Hugging Face tokenizer. |
|
To use this example, you'll need transformers version 4.39.0 or higher. |
|
```console |
|
pip install transformers==4.39.0 |
|
``` |
|
```python |
|
from transformers import AutoModelForCausalLM, AutoTokenizer |
|
|
|
model_id = "mistralai/Mixtral-8x22B-Instruct-v0.1" |
|
tokenizer = AutoTokenizer.from_pretrained(model_id) |
|
conversation=[ |
|
{"role": "user", "content": "What's the weather like in Paris?"}, |
|
{ |
|
"role": "tool_calls", |
|
"content": [ |
|
{ |
|
"name": "get_current_weather", |
|
"arguments": {"location": "Paris, France", "format": "celsius"}, |
|
|
|
} |
|
] |
|
}, |
|
{ |
|
"role": "tool_results", |
|
"content": {"content": 22} |
|
}, |
|
{"role": "assistant", "content": "The current temperature in Paris, France is 22 degrees Celsius."}, |
|
{"role": "user", "content": "What about San Francisco?"} |
|
] |
|
|
|
|
|
tools = [{"type": "function", "function": {"name":"get_current_weather", "description": "Get▁the▁current▁weather", "parameters": {"type": "object", "properties": {"location": {"type": "string", "description": "The city and state, e.g. San Francisco, CA"}, "format": {"type": "string", "enum": ["celsius", "fahrenheit"], "description": "The temperature unit to use. Infer this from the users location."}},"required":["location","format"]}}}] |
|
|
|
# render the tool use prompt as a string: |
|
tool_use_prompt = tokenizer.apply_chat_template( |
|
conversation, |
|
chat_template="tool_use", |
|
tools=tools, |
|
tokenize=False, |
|
add_generation_prompt=True, |
|
|
|
) |
|
model = AutoModelForCausalLM.from_pretrained("mistralai/Mixtral-8x22B-Instruct-v0.1") |
|
|
|
inputs = tokenizer(tool_use_prompt, return_tensors="pt") |
|
|
|
outputs = model.generate(**inputs, max_new_tokens=20) |
|
print(tokenizer.decode(outputs[0], skip_special_tokens=True)) |
|
``` |
|
|
|
# Instruct tokenizer |
|
The HuggingFace tokenizer included in this release should match our own. To compare: |
|
`pip install mistral-common` |
|
|
|
```py |
|
from mistral_common.protocol.instruct.messages import ( |
|
AssistantMessage, |
|
UserMessage, |
|
) |
|
from mistral_common.tokens.tokenizers.mistral import MistralTokenizer |
|
from mistral_common.tokens.instruct.normalize import ChatCompletionRequest |
|
|
|
from transformers import AutoTokenizer |
|
|
|
tokenizer_v3 = MistralTokenizer.v3() |
|
|
|
mistral_query = ChatCompletionRequest( |
|
messages=[ |
|
UserMessage(content="How many experts ?"), |
|
AssistantMessage(content="8"), |
|
UserMessage(content="How big ?"), |
|
AssistantMessage(content="22B"), |
|
UserMessage(content="Noice 🎉 !"), |
|
], |
|
model="test", |
|
) |
|
hf_messages = mistral_query.model_dump()['messages'] |
|
|
|
tokenized_mistral = tokenizer_v3.encode_chat_completion(mistral_query).tokens |
|
|
|
tokenizer_hf = AutoTokenizer.from_pretrained('mistralai/Mixtral-8x22B-Instruct-v0.1') |
|
tokenized_hf = tokenizer_hf.apply_chat_template(hf_messages, tokenize=True) |
|
|
|
assert tokenized_hf == tokenized_mistral |
|
``` |
|
|
|
# Function calling and special tokens |
|
This tokenizer includes more special tokens, related to function calling : |
|
- [TOOL_CALLS] |
|
- [AVAILABLE_TOOLS] |
|
- [/AVAILABLE_TOOLS] |
|
- [TOOL_RESULTS] |
|
- [/TOOL_RESULTS] |
|
|
|
If you want to use this model with function calling, please be sure to apply it similarly to what is done in our [SentencePieceTokenizerV3](https://github.com/mistralai/mistral-common/blob/main/src/mistral_common/tokens/tokenizers/sentencepiece.py#L299). |
|
|
|
# The Mistral AI Team |
|
Albert Jiang, Alexandre Sablayrolles, Alexis Tacnet, Antoine Roux, |
|
Arthur Mensch, Audrey Herblin-Stoop, Baptiste Bout, Baudouin de Monicault, |
|
Blanche Savary, Bam4d, Caroline Feldman, Devendra Singh Chaplot, |
|
Diego de las Casas, Eleonore Arcelin, Emma Bou Hanna, Etienne Metzger, |
|
Gianna Lengyel, Guillaume Bour, Guillaume Lample, Harizo Rajaona, |
|
Jean-Malo Delignon, Jia Li, Justus Murke, Louis Martin, Louis Ternon, |
|
Lucile Saulnier, Lélio Renard Lavaud, Margaret Jennings, Marie Pellat, |
|
Marie Torelli, Marie-Anne Lachaux, Nicolas Schuhl, Patrick von Platen, |
|
Pierre Stock, Sandeep Subramanian, Sophia Yang, Szymon Antoniak, Teven Le Scao, |
|
Thibaut Lavril, Timothée Lacroix, Théophile Gervet, Thomas Wang, |
|
Valera Nemychnikova, William El Sayed, William Marshall |