jovyan/Swallow-MS-7b-v0.1-ChatVector

jovyan committed on Mar 20, 2024

Commit 837105b · verified · 1 Parent(s): c83bd2a

Update README.md

Files changed (1)

README.md +55 -0

@@ -1,3 +1,58 @@
 ---
 license: apache-2.0
+language:
+- ja
+- en
+library_name: transformers
+pipeline_tag: text-generation
+model_type: mistral
 ---
+# Swallow-MS-7b-v0.1-ChatVector
+A Japanese "instruction-tuned" model built with the [Chat Vector](https://arxiv.org/abs/2310.04799) technique.
+The weights of this model were obtained not by instruction tuning but by the following arithmetic:
+> [Swallow-MS-7b-v0.1](https://huggingface.co/tokyotech-llm/Swallow-MS-7b-v0.1) + [Mistral-7B-Instruct-v0.2](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2) - [Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1)
+## Instruction format
+The prompt format should be the same as that of [Mistral-7B-Instruct-v0.2](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2).
+For example:
+```
+text = "<s>[INST] What is your favourite condiment? [/INST]"
+"Well, I'm quite partial to a good squeeze of fresh lemon juice. It adds just the right amount of zesty flavour to whatever I'm cooking up in the kitchen!</s> "
+"[INST] Do you have mayonnaise recipes? [/INST]"
+```
+## Usage
+```python
+from transformers import AutoModelForCausalLM, AutoTokenizer
+import torch
+model_name = "jovyan/Swallow-MS-7b-v0.1-ChatVector"
+tokenizer = AutoTokenizer.from_pretrained(model_name)
+model = AutoModelForCausalLM.from_pretrained(
+    model_name,
+    torch_dtype=torch.bfloat16,
+    device_map="auto",
+)
+prompt = "<s>[INST] 東京工業大学のキャンパスの特色を元気よく説明してください。 [/INST]"
+input_ids = tokenizer.encode(
+    prompt,
+    add_special_tokens=False,  # the prompt string already starts with the <s> BOS token
+    return_tensors="pt"
+)
+tokens = model.generate(
+    input_ids.to(device=model.device),
+    max_new_tokens=128,
+    temperature=0.99,
+    top_p=0.95,
+    do_sample=True,
+)
+out = tokenizer.decode(tokens[0], skip_special_tokens=True)
+print(out)
+```
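
The weight arithmetic described in the README (Swallow-MS-7b-v0.1 + Mistral-7B-Instruct-v0.2 - Mistral-7B-v0.1) can be illustrated with a toy sketch. This is not the author's actual merge script, which the commit does not include. The function name `apply_chat_vector`, the parameter names, and the toy values are all hypothetical. The sketch also assumes that any parameter whose size differs between the models (for example, an embedding with a different vocabulary size) is simply copied from the target model; the README does not say how such parameters were handled.

```python
# Toy illustration of the Chat Vector arithmetic on plain Python lists.
# Real checkpoints hold tensors, but the per-parameter rule is the same:
#   merged = target + (instruct - base)

def apply_chat_vector(target, instruct, base):
    """Add the chat vector (instruct - base) to every compatible parameter."""
    merged = {}
    for name, weights in target.items():
        compatible = (
            name in instruct
            and name in base
            and len(instruct[name]) == len(base[name]) == len(weights)
        )
        if not compatible:
            # Assumption: parameters with mismatched sizes are left untouched.
            merged[name] = list(weights)
            continue
        merged[name] = [
            t + (i - b) for t, i, b in zip(weights, instruct[name], base[name])
        ]
    return merged

# Hypothetical toy "checkpoints" standing in for the three models.
target = {"layer.weight": [1.0, 2.0], "embed.weight": [0.5, 0.5, 0.5]}
instruct = {"layer.weight": [0.5, 0.25], "embed.weight": [0.1, 0.1]}
base = {"layer.weight": [0.25, 0.5], "embed.weight": [0.1, 0.1]}

merged = apply_chat_vector(target, instruct, base)
print(merged)
```

With real models, the same loop would run over the three state dicts, with element-wise tensor addition and subtraction in place of the list comprehension.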