---
language:
- en
license: apache-2.0
tags:
- pretrained
pipeline_tag: text-generation
inference:
  parameters:
    temperature: 0.7
---

# Model Card for Mistral-7B-v0.1

The Mistral-7B-v0.1 Large Language Model (LLM) is a pretrained generative text model with 7 billion parameters.
Mistral-7B-v0.1 outperforms Llama 2 13B on all benchmarks we tested.

For full details of this model please read our [paper](https://arxiv.org/abs/2310.06825) and [release blog post](https://mistral.ai/news/announcing-mistral-7b/).

Mistral-7B-v0.1 has the following characteristics:
- 7.3B parameters
- Byte-fallback BPE tokenizer
- Grouped-Query Attention
- 8k context window
- 4k Sliding-Window Attention
- 32000 vocab size
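
To make the sliding-window attention entry in the list above more concrete, here is a minimal sketch of a causal sliding-window mask in plain Python. The function name `sliding_window_causal_mask` and the toy sizes are illustrative assumptions, and so is the boundary convention (each position sees itself plus the `window - 1` positions before it). This is not taken from the model's implementation.

```python
def sliding_window_causal_mask(seq_len: int, window: int) -> list[list[bool]]:
    """Return mask[i][j] == True when query position i may attend to key position j.

    Assumed convention: position i sees itself and the previous window - 1
    positions, and never sees future positions.
    """
    if seq_len < 0 or window < 1:
        raise ValueError("seq_len must be >= 0 and window must be >= 1")
    return [
        [j <= i and i - j < window for j in range(seq_len)]
        for i in range(seq_len)
    ]


# Toy sizes so the pattern is visible; the card lists a 4k window.
mask = sliding_window_causal_mask(seq_len=6, window=3)
for row in mask:
    print("".join("x" if allowed else "." for allowed in row))
```

Each printed row has at most `window` marked positions, so under this convention the attention cost per token is bounded by the window size rather than by the full sequence length.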

## How to use

We recommend using `mistralai/Mistral-7B-v0.1` with [mistral_inference](https://github.com/mistralai/mistral-inference). For Hugging Face `transformers` code snippets, see the `transformers` section further down.

## Generate with `mistral_inference`

### Install dependencies
```
pip install mistral_inference
```

### Download model

```py
from huggingface_hub import snapshot_download
from pathlib import Path

# Local directory that will hold the model files: ~/mistral_models/7B-v0.1
mistral_models_path = Path.home().joinpath('mistral_models', '7B-v0.1')
mistral_models_path.mkdir(parents=True, exist_ok=True)

# Download only the files matched by allow_patterns
snapshot_download(repo_id="mistralai/Mistral-7B-v0.1", allow_patterns=["params.json", "consolidated.safetensors", "tokenizer.model"], local_dir=mistral_models_path)
```
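
Before running the demo, it can help to confirm that the download produced the files requested above. The following is a small sketch using only the standard library. The helper name `missing_model_files` is hypothetical, and the file list is simply copied from the `allow_patterns` argument in the download snippet.

```python
from pathlib import Path

# File names taken from the allow_patterns list in the download snippet.
EXPECTED_FILES = ("params.json", "consolidated.safetensors", "tokenizer.model")


def missing_model_files(model_dir: Path) -> list[str]:
    """Return the expected file names that are not present in model_dir."""
    return [name for name in EXPECTED_FILES if not (model_dir / name).is_file()]


if __name__ == "__main__":
    # Same location as in the download snippet.
    model_dir = Path.home().joinpath("mistral_models", "7B-v0.1")
    missing = missing_model_files(model_dir)
    if missing:
        print(f"Missing from {model_dir}: {', '.join(missing)}")
    else:
        print(f"All expected files found in {model_dir}")
```

An empty result only means the files exist. It does not verify that they were downloaded completely.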

### Demo

After installing `mistral_inference`, a `mistral-demo` CLI command should be available in your environment.

```
mistral-demo $HOME/mistral_models/7B-v0.1
```

This should print something along the following lines:

```
This is a test of the emergency broadcast system. This is only a test.

If this were a real emergency, you would be told what to do.

This is a test
=====================
This is another test of the new blogging software. I’m not sure if I’m going to keep it or not. I’m not sure if I’m going to keep
=====================
This is a third test, mistral AI is very good at testing. 🙂

This is a third test, mistral AI is very good at testing. 🙂

This
=====================
```

## Generate with `transformers`

### Install dependencies
```
pip install transformers
```

### Text completion

```py
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mistralai/Mistral-7B-v0.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Load the pretrained weights, then tokenize the prompt.
# return_tensors="pt" returns PyTorch tensors, so PyTorch must be installed.
model = AutoModelForCausalLM.from_pretrained(model_id)
inputs = tokenizer("Hello my name is", return_tensors="pt")

# Generate up to 20 new tokens and decode the result back to text.
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
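
The metadata at the top of this card lists `temperature: 0.7` as an inference parameter. As a rough illustration of what such a value does during sampling, the sketch below applies a temperature-scaled softmax to a few made-up logits. The function name `softmax_with_temperature` and the logit values are hypothetical. This is the textbook formulation, not the code path used by `transformers` or `mistral_inference`.

```python
import math


def softmax_with_temperature(logits: list[float], temperature: float) -> list[float]:
    """Turn raw scores into probabilities after dividing them by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.0]  # made-up scores for three candidate tokens
for t in (1.0, 0.7):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

A temperature below 1 sharpens the distribution toward the highest-scoring token, and a temperature above 1 flattens it. With `transformers`, sampling is typically enabled by passing `do_sample=True` together with `temperature` to `generate`.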

## Notice

Mistral-7B is a pretrained base model and therefore does not have any moderation mechanisms.

## The Mistral AI Team

Albert Jiang, Alexandre Sablayrolles, Arthur Mensch, Chris Bamford, Devendra Singh Chaplot, Diego de las Casas, Florian Bressand, Gianna Lengyel, Guillaume Lample, Lélio Renard Lavaud, Lucile Saulnier, Marie-Anne Lachaux, Pierre Stock, Teven Le Scao, Thibaut Lavril, Thomas Wang, Timothée Lacroix, William El Sayed.