Swallow-MS-7b-v0.1-ChatVector

A Japanese "instruction-tuned" model built with the Chat Vector technique.

The weights of this model are obtained not by any instruction tuning but by the following arithmetic:

Swallow-MS-7b-v0.1 + Mistral-7B-Instruct-v0.2 - Mistral-7B-v0.1
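
The arithmetic above can be sketched per parameter as follows. This is only an illustration, not the script used to build this model: the function name `apply_chat_vector`, the `skip` argument, and the toy parameter names are hypothetical. In particular, leaving some tensors unchanged (for example embeddings whose shapes differ between models) is an assumption; the source does not say how such tensors were handled. Plain floats stand in for weight tensors here, but the same loop works for any values that support `+` and `-`, such as the entries of a PyTorch `state_dict()`.

```python
def apply_chat_vector(target, instruct, base, skip=()):
    """Return target + (instruct - base) for every shared parameter.

    target   -- weights of the model receiving the chat ability
                (here: Swallow-MS-7b-v0.1)
    instruct -- weights of the instruction-tuned model
                (here: Mistral-7B-Instruct-v0.2)
    base     -- weights of the model the instruct model was tuned from
                (here: Mistral-7B-v0.1)
    skip     -- parameter names to copy from target unchanged
                (hypothetical option, an assumption of this sketch)
    """
    merged = {}
    for name, weight in target.items():
        # Parameters that are skipped or missing from either Mistral model
        # are kept exactly as they are in the target model.
        if name in skip or name not in instruct or name not in base:
            merged[name] = weight
            continue
        chat_vector = instruct[name] - base[name]
        merged[name] = weight + chat_vector
    return merged


# Toy example: floats stand in for weight tensors.
target = {"layer.w": 1.0, "embed": 5.0}
instruct = {"layer.w": 0.75, "embed": 9.0}
base = {"layer.w": 0.5, "embed": 8.0}

merged = apply_chat_vector(target, instruct, base, skip=("embed",))
print(merged)
```

With real models, the three dictionaries would come from each model's `state_dict()`, and the merged values would be loaded back into the target model before saving.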


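Using the Chat Vector technique, this model gives Swallow-MS-7b-v0.1 chat-style conversational ability solely by adding and subtracting pretrained weights.

Details are explained in this Japanese-language article.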

Instruction format

The prompt format should be the same as that of Mistral-7B-Instruct-v0.2.

E.g.

text = "<s>[INST] What is your favourite condiment? [/INST]"
"Well, I'm quite partial to a good squeeze of fresh lemon juice. It adds just the right amount of zesty flavour to whatever I'm cooking up in the kitchen!</s> "
"[INST] Do you have mayonnaise recipes? [/INST]"
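
A multi-turn prompt in the format shown above can be assembled with a small helper. This is a minimal sketch: the function name `build_prompt` and its input convention (a list of strings alternating user and assistant turns, ending with a user turn) are assumptions made for illustration, not part of the model or of the transformers library.

```python
def build_prompt(turns):
    """Build a Mistral-Instruct style prompt string.

    turns -- list of strings alternating user, assistant, user, ...
             It must end with a user turn, so its length must be odd.
    """
    if len(turns) % 2 == 0:
        raise ValueError("turns must be non-empty and end with a user message")

    parts = ["<s>"]
    for i, text in enumerate(turns):
        if i % 2 == 0:
            # User turn: wrapped in [INST] ... [/INST]
            parts.append(f"[INST] {text} [/INST]")
        else:
            # Assistant turn: followed by the end-of-sequence token and a space
            parts.append(f"{text}</s> ")
    return "".join(parts)


prompt = build_prompt([
    "What is your favourite condiment?",
    "Well, I'm quite partial to a good squeeze of fresh lemon juice.",
    "Do you have mayonnaise recipes?",
])
print(prompt)
```

Because the returned string already starts with `<s>`, it should be tokenized with `add_special_tokens=False`, as in the usage code in this card.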

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_name = "jovyan/Swallow-MS-7b-v0.1-ChatVector"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

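# Japanese prompt; in English: "Please describe the distinctive features of
# the Tokyo Institute of Technology campus in a lively way."
prompt = "<s>[INST] 東京工業大学のキャンパスの特色を元気よく説明してください。 [/INST]"
# The prompt already contains the <s> token, so no special tokens are added here.
input_ids = tokenizer.encode(
    prompt,
    add_special_tokens=False,
    return_tensors="pt"
)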
tokens = model.generate(
    input_ids.to(device=model.device),
    max_new_tokens=128,
    temperature=0.99,
    top_p=0.95,
    do_sample=True,
)

out = tokenizer.decode(tokens[0], skip_special_tokens=True)
print(out)

Model size: 7.33B params, tensor type BF16 (Safetensors)