jovyan committed
Commit 837105b · verified · 1 Parent(s): c83bd2a

Update README.md

Files changed (1)
  1. README.md +55 -0
README.md CHANGED
@@ -1,3 +1,58 @@
  ---
  license: apache-2.0
+ language:
+ - ja
+ - en
+ library_name: transformers
+ pipeline_tag: text-generation
+ model_type: mistral
  ---
+ # Swallow-MS-7b-v0.1-ChatVector
+
+ A Japanese "instruction-tuned" model built with the [Chat Vector](https://arxiv.org/abs/2310.04799) technique.
+
+ The weights of this model were obtained not by instruction tuning but by the following weight arithmetic:
+ > [Swallow-MS-7b-v0.1](https://huggingface.co/tokyotech-llm/Swallow-MS-7b-v0.1) + [Mistral-7B-Instruct-v0.2](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2) - [Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1)
+
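The arithmetic above can be sketched as a per-parameter operation over three sets of weights. This is a minimal illustration, not the script used to build this model: the function name `apply_chat_vector` and the `skip` argument are hypothetical, and the README does not say whether any tensors (for example embeddings, whose shapes differ when vocabularies differ) were left out of the sum.

```python
from typing import Any, Dict, Iterable, Mapping


def apply_chat_vector(
    target: Mapping[str, Any],
    instruct: Mapping[str, Any],
    base: Mapping[str, Any],
    skip: Iterable[str] = (),
) -> Dict[str, Any]:
    """Return target + (instruct - base) for every shared parameter name.

    The values only need to support `+` and `-`, so plain floats, NumPy
    arrays, or torch tensors of matching shape all work.
    """
    skip_names = set(skip)  # assumption: caller decides which tensors to leave untouched
    merged: Dict[str, Any] = {}
    for name, weight in target.items():
        # Keep the target weight as-is when it is skipped or has no counterpart.
        if name in skip_names or name not in instruct or name not in base:
            merged[name] = weight
            continue
        # Chat vector = instruct - base; add it to the target model's weight.
        merged[name] = weight + (instruct[name] - base[name])
    return merged
```

With `transformers`, the three mappings could plausibly be the `state_dict()` of Swallow-MS-7b-v0.1 (`target`), Mistral-7B-Instruct-v0.2 (`instruct`), and Mistral-7B-v0.1 (`base`), with the merged result loaded back into the target model.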
+ ## Instruction format
+
+ The prompt format should be the same as that of [Mistral-7B-Instruct-v0.2](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2).
+
+ For example:
+ ```
+ text = "<s>[INST] What is your favourite condiment? [/INST]"
+ "Well, I'm quite partial to a good squeeze of fresh lemon juice. It adds just the right amount of zesty flavour to whatever I'm cooking up in the kitchen!</s> "
+ "[INST] Do you have mayonnaise recipes? [/INST]"
+ ```
+
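A multi-turn prompt in this format can be assembled with a small helper. The sketch below follows the layout of the example above; the name `build_prompt` and the `(user, assistant)` tuple convention are hypothetical, not part of this repository.

```python
from typing import List, Optional, Tuple


def build_prompt(turns: List[Tuple[str, Optional[str]]]) -> str:
    """Build a Mistral-Instruct style prompt from (user, assistant) turns.

    Pass None as the assistant reply of the last turn to leave the prompt
    open for the model to answer.
    """
    prompt = "<s>"  # beginning-of-sequence token appears once, at the start
    for user, assistant in turns:
        prompt += f"[INST] {user} [/INST]"
        if assistant is not None:
            # A finished assistant reply is closed with </s> and a space.
            prompt += f"{assistant}</s> "
    return prompt
```

Because the returned string already begins with `<s>`, it should be tokenized with `add_special_tokens=False`, as in the Usage section.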
+ ## Usage
+
+ ```python
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+ import torch
+
+ model_name = "jovyan/Swallow-MS-7b-v0.1-ChatVector"
+ tokenizer = AutoTokenizer.from_pretrained(model_name)
+ model = AutoModelForCausalLM.from_pretrained(
+     model_name,
+     torch_dtype=torch.bfloat16,
+     device_map="auto",
+ )
+
+ # English: "Please give an upbeat description of what makes the
+ # Tokyo Institute of Technology campus distinctive."
+ prompt = "<s>[INST] 東京工業大学のキャンパスの特色を元気よく説明してください。 [/INST]"
+ # The prompt already starts with <s>, so the tokenizer must not add it again.
+ input_ids = tokenizer.encode(
+     prompt,
+     add_special_tokens=False,
+     return_tensors="pt",
+ )
+ # Sample up to 128 new tokens with nucleus sampling.
+ tokens = model.generate(
+     input_ids.to(device=model.device),
+     max_new_tokens=128,
+     temperature=0.99,
+     top_p=0.95,
+     do_sample=True,
+ )
+
+ # Decode the full sequence (the prompt followed by the generated reply).
+ out = tokenizer.decode(tokens[0], skip_special_tokens=True)
+ print(out)
+ ```
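For decoder-only models, `generate` returns the prompt tokens followed by the newly generated ones, so the decoded text above includes the prompt. If only the reply is wanted, the prompt can be sliced off before decoding. A minimal sketch; `strip_prompt` is a hypothetical helper name.

```python
def strip_prompt(output_ids, prompt_length: int):
    """Drop the first `prompt_length` token ids and keep only the continuation.

    Works on anything that supports slicing, such as a list of ids or a
    1-D tensor like `tokens[0]`.
    """
    return output_ids[prompt_length:]
```

In the snippet above this would be used as `tokenizer.decode(strip_prompt(tokens[0], input_ids.shape[1]), skip_special_tokens=True)`.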