|
--- |
|
language: |
|
- fr |
|
- en |
|
tags: |
|
- mistral |
|
--- |
|
|
|
# mistral-7B-v0.1 |
|
|
|
Released on September 27, 2023 by Mistral AI, with no further information beyond a magnet link at the time of release.
|
|
|
|
|
### Config |
|
|
|
```json |
|
{ |
|
"dim": 4096, |
|
"n_layers": 32, |
|
"head_dim": 128, |
|
"hidden_dim": 14336, |
|
"n_heads": 32, |
|
"n_kv_heads": 8, |
|
"norm_eps": 1e-05, |
|
"sliding_window": 4096, |
|
"vocab_size": 32000 |
|
} |
|
``` |
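
For orientation, here is a minimal sketch of reading this config into a dataclass. The path `mistral-7B-v0.1/params.json` and the assumption that the torrent ships a Llama-style `params.json` containing exactly the keys above are mine, not confirmed by the release. Note that `n_kv_heads` (8) being smaller than `n_heads` (32) points to grouped-query attention, and `sliding_window` has no counterpart in the Llama 2 reference code.

```python
import json
from dataclasses import dataclass


@dataclass
class MistralArgs:
    # Field names mirror the params.json keys shown above (assumption: the
    # torrent ships a Llama-style params.json containing exactly these keys).
    dim: int
    n_layers: int
    head_dim: int
    hidden_dim: int
    n_heads: int
    n_kv_heads: int       # 8 < n_heads = 32, i.e. grouped-query attention
    norm_eps: float
    sliding_window: int   # no equivalent in Llama 2's reference ModelArgs
    vocab_size: int


def load_args(path: str = "mistral-7B-v0.1/params.json") -> MistralArgs:
    """Read params.json (the path is a guess at the checkpoint layout)."""
    with open(path) as f:
        return MistralArgs(**json.load(f))
```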
|
|
|
### Training data |
|
|
|
Potentially up to 8T tokens, including English, French, and code (speculation, not officially confirmed):
|
|
|
https://twitter.com/ManuelFaysse/status/1706949891358859624 |
|
|
|
|
|
### Magnet link |
|
|
|
`magnet:?xt=urn:btih:208b101a0f51514ecf285885a8b0f6fb1a1e4d7d&dn=mistral-7B-v0.1&tr=udp%3A%2F%2Ftracker.opentrackr.org%3A1337%2Fannounce&tr=https%3A%2F%2Ftracker1.520.jp%3A443%2Fannounce`
|
|
|
### Usage |
|
|
|
Probably something like the Llama 2 models from the non-HF release! The architecture is not identical to Llama, though: the config above points to grouped-query attention (`n_kv_heads` = 8 vs. `n_heads` = 32) and sliding-window attention (`sliding_window` = 4096), so conversion to the HF format will not be completely direct.

Let's figure this out together!
|
|
|
```bash
torchrun --nproc_per_node 1 example_text_completion.py \
    --ckpt_dir llama-2-7b/ \
    --tokenizer_path tokenizer.model \
    --max_seq_len 128 --max_batch_size 4
```
|
|
|
`example_text_completion.py` |
|
```python
# Copyright (c) Meta Platforms, Inc. and affiliates.
# This software may be used and distributed according to the terms of the Llama 2 Community License Agreement.

import fire

from llama import Llama
from typing import List


def main(
    ckpt_dir: str,
    tokenizer_path: str,
    temperature: float = 0.6,
    top_p: float = 0.9,
    max_seq_len: int = 128,
    max_gen_len: int = 64,
    max_batch_size: int = 4,
):
    """
    Entry point of the program for generating text using a pretrained model.

    Args:
        ckpt_dir (str): The directory containing checkpoint files for the pretrained model.
        tokenizer_path (str): The path to the tokenizer model used for text encoding/decoding.
        temperature (float, optional): The temperature value for controlling randomness in generation.
            Defaults to 0.6.
        top_p (float, optional): The top-p sampling parameter for controlling diversity in generation.
            Defaults to 0.9.
        max_seq_len (int, optional): The maximum sequence length for input prompts. Defaults to 128.
        max_gen_len (int, optional): The maximum length of generated sequences. Defaults to 64.
        max_batch_size (int, optional): The maximum batch size for generating sequences. Defaults to 4.
    """
    generator = Llama.build(
        ckpt_dir=ckpt_dir,
        tokenizer_path=tokenizer_path,
        max_seq_len=max_seq_len,
        max_batch_size=max_batch_size,
    )

    prompts: List[str] = [
        # For these prompts, the expected answer is the natural continuation of the prompt
        "I believe the meaning of life is",
        "Simply put, the theory of relativity states that ",
        """A brief message congratulating the team on the launch:

        Hi everyone,

        I just """,
        # Few shot prompt (providing a few examples before asking model to complete more);
        """Translate English to French:

        sea otter => loutre de mer
        peppermint => menthe poivrée
        plush girafe => girafe peluche
        cheese =>""",
    ]
    results = generator.text_completion(
        prompts,
        max_gen_len=max_gen_len,
        temperature=temperature,
        top_p=top_p,
    )
    for prompt, result in zip(prompts, results):
        print(prompt)
        print(f"> {result['generation']}")
        print("\n==================================\n")


if __name__ == "__main__":
    fire.Fire(main)
```
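
One concrete reason the conversion "will not be so direct": `Llama.build` reads `params.json` from `ckpt_dir` and forwards its keys to the reference `ModelArgs` dataclass, which does not accept `head_dim`, `hidden_dim` or `sliding_window`. The sketch below (a hypothetical, untested helper named `strip_params`) simply filters the config down to the keys the Llama 2 reference code knows about. Dropping those keys is not behaviour-preserving (the reference code derives the FFN width from `multiple_of`/`ffn_dim_multiplier` rather than taking `hidden_dim` directly, and it has no sliding-window attention), so treat it as a starting point for experimentation, not a working port.

```python
# Hypothetical, untested helper: shrink Mistral's params.json to the keys the
# reference Llama 2 ModelArgs accepts, so that Llama.build() at least parses it.
import json
from pathlib import Path

# Keys accepted by llama.model.ModelArgs in the Llama 2 reference release.
LLAMA_KEYS = {
    "dim", "n_layers", "n_heads", "n_kv_heads",
    "vocab_size", "multiple_of", "ffn_dim_multiplier", "norm_eps",
}


def strip_params(ckpt_dir: str) -> dict:
    params = json.loads((Path(ckpt_dir) / "params.json").read_text())
    dropped = sorted(set(params) - LLAMA_KEYS)
    # head_dim, hidden_dim and sliding_window have no Llama 2 equivalent;
    # dropping them silently changes the model (FFN width, attention window).
    print(f"Dropping keys unknown to the Llama 2 reference code: {dropped}")
    return {k: v for k, v in params.items() if k in LLAMA_KEYS}
```

The filtered dict would still have to be written back to `params.json` before calling `Llama.build`, and even then the mismatched FFN dimension means the checkpoint is unlikely to load cleanly without further changes to the model code.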